Sunday, 19 July 2015

Mini Blog: Classical Cryptography's Hill Cipher

This blog post covers a cipher from classical cryptography. It is not used nowadays because it is easily broken. In fact, the way it works can be understood by any high schooler or first-year college student who has an elementary knowledge of linear algebra.

Say you have a message you want to encrypt, e.g. "I like pie". We convert each character into a numerical representation (e.g. perhaps its ASCII code). Then we form a matrix out of these numbers. Let's call it matrix B.

Let's use the concepts of matrix multiplication and invertible matrices to encrypt and decrypt this message matrix B.

We can easily encrypt it by matrix-multiplying it with an arbitrary matrix A. The resulting matrix, AB, will be sent as the "encrypted message". Note that of course, the number of columns in matrix A must be equal to the number of rows in matrix B for matrix multiplication of A and B to work.

Now, you can send this encrypted message AB to anyone. But how can they decrypt it? Simple. We can use the property of invertible matrices.

(A^-1)(AB) = B

So, we just need to left-multiply the encrypted message AB by the inverse of matrix A to get back the original message B. Note that this means A must be an invertible matrix.
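To make this concrete, here is a minimal Python/NumPy sketch of the scheme described above. Note this is a toy version of the idea: the classical Hill cipher proper works with integer matrices modulo the alphabet size, but the encrypt-with-A, decrypt-with-A-inverse structure is the same. The key matrix A here is just an arbitrary invertible choice.

import numpy as np

# Key matrix A: must be invertible (non-zero determinant).
# This particular 3x3 key is just an illustrative choice.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])

message = "I like pie"

# Convert characters to numbers (ASCII codes), pad so the length
# is a multiple of 3, and reshape into matrix B (3 rows).
codes = [ord(c) for c in message]
while len(codes) % 3 != 0:
    codes.append(ord(" "))                     # pad with spaces
B = np.array(codes, dtype=float).reshape(3, -1, order="F")

encrypted = A @ B                              # encrypt: AB
decrypted = np.linalg.inv(A) @ encrypted       # decrypt: (A^-1)(AB) = B

recovered = "".join(chr(int(round(v))) for v in decrypted.flatten(order="F"))
print(recovered)                               # -> "I like pie" (plus padding)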

-------------------------------------
Resources:

Saturday, 4 July 2015

Mini Blog: Paint Applications and Splines

My friend and I recently had a conversation in which he mentioned that he and his dad had been talking about splines, which reminded me of a good software application of splines that my professor once taught me.

What are Splines?

I honestly think the Wikipedia definition is much clearer than any definition I could come up with for a spline, so here you go:

In mathematics, a spline is a numeric function that is piecewise-defined by polynomial functions, and which possesses a sufficiently high degree of smoothness at the places where the polynomial pieces connect (which are known as knots).
In interpolating problems, spline interpolation is often preferred to polynomial interpolation because it yields similar results to interpolating with higher degree polynomials while avoiding instability due to Runge's phenomenon. In computer graphics, parametric curves whose coordinates are given by splines are popular because of the simplicity of their construction, their ease and accuracy of evaluation, and their capacity to approximate complex shapes through curve fitting and interactive curve design.

Polynomial interpolation is where you have a set of dots/points, and you try to fit ONE polynomial function over the whole thing. Runge's phenomenon is where you try to fit a high-degree polynomial function over a set of equidistant data points, but then near the edges, really big oscillations occur, which isn't good for data analysis and prediction!


Runge's phenomenon, where the blue curve is a 5th order interpolating polynomial, and the green curve, which is really oscillating, is a 9th order interpolating polynomial

The problem of most simple drawing applications

When I was a newb at Android application development, I tried to make a simple drawing application to learn more about the touch gestures of a mobile device. Android has good APIs for gesture detection. In the custom drawing View class that you make for your drawing application, you can override the onTouchEvent method and handle the incoming user gestures.

To draw what the user draws, you simply have to get the x-y point when the user first presses down on the screen, and then draw a line between that point and the point the user moves to next. This means that as the user moves his finger across the screen, Android is getting x-y coordinates every XXX milliseconds, and each time, Android draws a line between the previous x-y coordinates and the current ones.

This has some problems, of course. Let's say the user swiped his finger really fast in a short amount of time. Because the Android API only gets x-y coordinates of the user's finger vs. time every XXX milliseconds, the resulting line that the user drew on the drawing application doesn't look smooth. It looks like a piece-wise function of straight lines connected to each other by dots, instead of a smooth curve. Below is an illustration I made about this:

Notice the lack of smoothness due to the C shaped curve being composed of multiple straight lines

So, in order to fix this, all you have to do is keep the history of all these points, from when the user first presses down until the user releases his finger. Then, using that list of point objects, you use spline interpolation to make the whole line smoother, and draw that smooth curve onto the canvas!
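Here's a rough sketch of the idea in Python (the original app was Android/Java, but the spline step is easier to show with SciPy; splprep/splev fit a smoothing parametric spline through the recorded touch points):

import numpy as np
from scipy.interpolate import splprep, splev

# Pretend these are the raw (x, y) touch samples Android gave us,
# captured every few milliseconds while the finger was down.
raw_points = [(10, 10), (40, 60), (90, 95), (150, 100), (210, 80), (250, 30)]

x = np.array([p[0] for p in raw_points], dtype=float)
y = np.array([p[1] for p in raw_points], dtype=float)

# Fit a parametric smoothing spline through the samples.
# s controls how much smoothing is applied (s=0 would interpolate exactly).
tck, u = splprep([x, y], s=2.0)

# Evaluate the spline at many parameter values to get a dense,
# smooth polyline you can hand to your drawing/canvas code.
u_fine = np.linspace(0, 1, 200)
smooth_x, smooth_y = splev(u_fine, tck)

smooth_curve = list(zip(smooth_x, smooth_y))
print(smooth_curve[:5])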

Edge Detection For Dummies

Author's Note: I'm working on an application at Microsoft that uses an edge detection library. Obviously, as the lowly undergraduate intern that I am, all I need to do is call a method and it magically detects edges in images. But I wanted to understand more about edge detection and not just use "a library". This blog post is part of my "for dummies" series, where I try to explain hard concepts in a simple way without super-sophisticated language, so that the average person (like me) can understand the concept.

References for all the information I'm about to explain are at the bottom of this blog post.

Edge detection is what it sounds like. You have an image, and you want to detect "edges". Pictures are worth a thousand words, so the below figure should answer any of your questions on what "edge detection" is.



In the study of edge detection and image processing, an edge is formally defined as a "significant local change in the image intensity, usually associated with a discontinuity in either the image intensity or the first derivative of the image intensity". This sounds really hard, but bear with me here.

Let me first talk about image intensity, and a bit later in this post, I'll get back to the "first derivative of image intensity". Image intensity is a bit hard to describe, but for a normal greyscale image, like the one I'm about to show you below, it can be thought of as "dark vs light", or colour. If you want to learn more about image intensity, you can find it here.

Now I'll try to explain more about what they mean by "discontinuity" and "first derivative of image intensity". Discontinuities in the image intensity can be visualized as step discontinuities, where the image intensity value literally changes from one value to a totally different value. You can represent this as a step function. Discontinuities can also be in the form of line discontinuities, where the image intensity also abruptly changes value, but then returns to the starting value within some short distance. Think of a very thick, vertical, black line you draw in a paint application. As you go across the line, you go from the paint application's default white background, to black, and then to white again.

Now, in a perfect image, these discontinuities would be sharp and we could easily detect them. However, because we don't live in an ideal world and because camera apps sometimes like to "smooth" captured photos, sharp discontinuities rarely exist in real photos. The step and line discontinuities you will actually obtain from processing the image will instead look like "ramp functions" or "roof functions". Examples of these two functions are drawn in the below figure.


In addition to sharp discontinuities not existing, another reason why developing a good edge detection algorithm is hard is that images have "noise" in them, and algorithms need to be pretty good at filtering out this "noise".


As you can see in the above figure, there is noise on the left-most and right-most parts of the graph, where both parts seem to have this "ramp" function... but as you can tell from the actual image above the graph, those ramp-like parts are just noise. There is nothing in the left and right parts of the actual image.

I'm going to now try to write a short list of definitions of some commonly used terms, which will be used later in this blog post. I will try to make these terms as understandable as possible.


Definitions

Edge Point - I'm sure you all know what a point means. An edge point is basically a point in the image that is part of what is considered to be an edge by whatever edge detection algorithm you use. So, basically, it is an (x, y) coordinate at a location where there is a significant image intensity difference.

Edge Fragment - An edge fragment is basically an edge point, along with that particular point's "edge orientation". An edge orientation is basically the direction the edge runs in. For example, I could draw a line horizontally in my paint application, and that edge would have a horizontal orientation. Obviously, this edge orientation is an actual angle/mathematical value, but I want to keep things simple.

Edge - When image processing experts talk about "edges", they actually mean "edge points" or "edge fragments". Read the definition of an edge detector below if that doesn't quite make sense yet.

Edge Detector - an edge detecting algorithm that actually produces "edges" (a.k.a. edge points or edge fragments) from an image.

Contour - A contour is a list of "edges" (a.k.a. edge points/edge fragments), or the mathematical curve that models that list; it is what most of us would informally call an "edge".

Edge Linking - "Edge linking is a process of forming an ordered list of "edges" from an unordered list. By convention, "edges" are ordered by traversal in a clockwise direction". To simplify, edge linking is linking edge points together on a small scale (e.g. if edge point A and edge point B are within a few pixels of each other, we determine that they can be linked).

Edge Following - "Edge following is the process of searching the {filtered} image to determine contours." To simplify, edge following is where you look at the whole image, filter out the noise, and then determine the contours. It is edge detection on a big scale, rather than a small scale.


Edge coordinates could be in the coordinate system of the original image (e.g. an image of pixel width X and pixel height Y), though more likely, they will be in a different coordinate system, as many edge detectors use a process called "filtering" in order to filter out noise. Filtering an image actually changes the coordinate system of the image, which may translate or scale the image coordinates.

Now, edge detectors cannot be perfect. The types of edges that can be obtained from an edge detector are: correct edges (so a human testing an edge detector can verify that "yep, it identified this edge correctly"), false edges (where the algorithm mistakes a non-edge for an edge), and missing edges (where the algorithm should have identified an edge but didn't). False edges are called false positives, and missing edges are called false negatives.

Now, remember when I talked about the "first derivative of image intensity"? What does this mean? Well, let's take the example of a step edge associated with a significant difference in image intensity. The first derivative, if you've taken an intro to calculus course, is just that - the first derivative. The slope. The below figure, taken from a neat website I referenced at the bottom of this blog post, has two images: the first one is the step function (though more like a continuous ramp function), and the graph below that is the derivative of that ramp function - a.k.a. the "first derivative of image intensity".


I will now define the "gradient". The gradient, as the figure above describes, is the first derivative of image intensity, and can be used to detect edges. How? Well, we know that the first graph can symbolize an edge because it goes from a low value to a high value very abruptly. Every time a change like this occurs, the first derivative has a local maximum! So, that means every time a big enough local maximum occurs, we can determine that there is an edge point!

So, you can imagine that an image is just a huge 2D array of these continuous functions of image intensities, and by interpreting all of these graphs, we can eventually detect the edges in a whole image.
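To make the 1D idea concrete, here's a tiny illustrative Python sketch (my own toy example, not from the textbook): take a row of intensity values, approximate the first derivative with differences, and flag places where its magnitude exceeds a threshold as edge points.

import numpy as np

# A 1D "row" of image intensities: dark region, a ramp up, then a bright region.
intensity = np.array([10, 10, 10, 12, 60, 120, 200, 200, 200], dtype=float)

# First derivative approximated by finite differences between neighbours.
gradient = np.diff(intensity)

# Anywhere the change is big enough, call it an edge point.
threshold = 30
edge_points = np.where(np.abs(gradient) > threshold)[0]

print(gradient)     # [ 0.  0.  2. 48. 60. 80.  0.  0.]
print(edge_points)  # indices where the intensity jumps: [3 4 5]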

Now, the figure above only represents a 1D gradient and a 1D image intensity. Pictures are 2D, so we need to somehow define a 2D gradient. Below is a screenshot of a page from a textbook that describes the "formal definition" of 2D gradient:


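For reference, the standard definition treats the image intensity as a function f(x, y), and defines the gradient as the vector of its partial derivatives in the x and y directions:

G[f(x, y)] = [Gx, Gy] = [∂f/∂x, ∂f/∂y]

The gradient magnitude is sqrt(Gx^2 + Gy^2) and the gradient direction is arctan(Gy / Gx); edge detectors look for places where this magnitude is large.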
Most edge detection algorithms use something called a "mask" in order to approximate the gradient values in an image. Examples of some different types of masks are shown in the below figure.


How do you use one of these masks, you might ask? The method is pretty simple. Let's say we are using the Sobel mask (which has a mask in the x-direction and the y-direction). For each 3x3 group of pixels, we apply Sobel's x- and y- masks by a process known as "convolution". An easy example of how to do convolution is in this link.

These matrices that get convolved with the original source image's matrix are not only known as masks, but also as filters, because they filter out noisy signals to get the signal that we want - the edges.

There are other types of filters as well, such as ones with sizes of 2x1, 1x2, et cetera. But these are a bit hard to think about because there isn't really a "center" pixel in such a mask/matrix. Some of these mask matrices can be quite large as well, which we will talk about later in this blog post.

The simplest numerical approximation for gradient would be:

Gx = f[i, j+1] - f[i, j]

Gy = f[i, j] - f[i+1, j]

where j corresponds to the x direction and i to the negative y direction.

The corresponding masks for this simple gradient approximation are illustrated below:


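As a quick sketch of what these formulas do in code (my own toy example), here they are applied directly with NumPy array slicing on a tiny synthetic image:

import numpy as np

# A tiny 5x5 "image": dark on the left, bright on the right.
f = np.array([
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
], dtype=float)

# Gx[i, j] = f[i, j+1] - f[i, j]  (difference along the x / column direction)
Gx = f[:, 1:] - f[:, :-1]

# Gy[i, j] = f[i, j] - f[i+1, j]  (difference along the negative y / row direction)
Gy = f[:-1, :] - f[1:, :]

print(Gx)   # large values (190) in the column where the dark-to-bright jump happens
print(Gy)   # all zeros: intensity doesn't change from row to row in this image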
Algorithms for edge detection generally contain these three steps:

Filtering: Images have noise. Filtering is used to reduce this noise. However, there's actually a trade-off, because by filtering to reduce noise, you could potentially reduce edge strength in the images as well. More filtering to reduce noise results in a loss of edge strength.

Enhancement: Enhancement emphasizes pixels where there are local "first derivative intensity value" maxima, and is usually performed by computing the gradient magnitude. There are several ways to compute the gradient (e.g. Roberts, Sobel, etc.), which we will talk about below.

Detection: In an image, there might be lots and lots of local maxima when looking at the first derivative of image intensity. How do we actually determine what is an edge and what isn't? Typically, making a threshold of e.g. "if the maxima is above X value, then it's an edge!" is used.

Many edge detectors have been developed in the last two decades. I will now talk about some commonly used edge detectors.

Roberts Operator

The Roberts cross operator provides a simple approximation to the gradient magnitude:


Ok, maybe it doesn't look simple. You don't need to formally know the definitions, though, as you can just "use the mask". The mask you would use is shown in the following figure:


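For reference, the standard Roberts cross masks are a pair of 2x2 matrices, one per diagonal direction. A small sketch, written out as NumPy arrays:

import numpy as np

# The two standard 2x2 Roberts cross masks, one per diagonal direction.
roberts_x = np.array([[1,  0],
                      [0, -1]], dtype=float)
roberts_y = np.array([[ 0, 1],
                      [-1, 0]], dtype=float)

# The gradient magnitude at a pixel is approximated from the two mask
# responses Gx and Gy, e.g. as sqrt(Gx**2 + Gy**2) or |Gx| + |Gy|.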
Sobel Operator

The Sobel operator is probably one of the most commonly used edge detectors, according to the textbook screenshot below, which also gives you the Sobel masks:



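Here's a rough, self-contained Python sketch putting the pieces together with the standard 3x3 Sobel masks: convolve the image with the x and y masks, compute the gradient magnitude, and threshold it. The tiny synthetic image and the threshold value are arbitrary choices for the example.

import numpy as np
from scipy.ndimage import convolve

# Standard 3x3 Sobel masks.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]], dtype=float)

# A small synthetic greyscale image: dark left half, bright right half.
image = np.zeros((8, 8))
image[:, 4:] = 255.0

# Approximate the gradient by convolving with each mask.
Gx = convolve(image, sobel_x)
Gy = convolve(image, sobel_y)

# Gradient magnitude, then a simple threshold to decide what counts as an edge.
magnitude = np.sqrt(Gx**2 + Gy**2)
edges = magnitude > 100.0

print(edges.astype(int))  # 1s appear along the vertical boundary between dark and bright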
Second Derivative Operators

Now that I've talked about how to do edge detection using the gradient of image intensity, I will now try to explain how you can also use the "second derivative of image intensity" in order to do edge detection. The edge detectors I talked about earlier calculated the first derivative and, if it was above a certain X threshold, the presence of an edge point was assumed. This actually results in the detection of too many edge points. A better approach would be to find only the points that have local maxima in gradient values and consider them edge points. This means that at edge points, there will be a peak in the first derivative and, equivalently, there will be a zero crossing in the second derivative. 

Thus, edge points may be detected by finding the zero crossings of the second derivative of the image intensity. This is the second major way of edge detection, instead of just the "gradient" way. 


There are two operators in two dimensions that correspond to the second derivative: the Laplacian and the second directional derivative. This sounds hard, but basically, there are two ways of looking at the second derivative, and correspondingly, two different ways of making an algorithm for edge detection using the "zero crossing" method.

I won't go too deep into this, other than to say that the Laplacian and the second directional derivative can give you masks that you can use in your edge detection code, just like the "gradient" methods:


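As a rough sketch of the zero-crossing idea (my own toy example, using the same synthetic image as the Sobel sketch above; the 3x3 kernel below is a commonly used approximation of the Laplacian):

import numpy as np
from scipy.ndimage import convolve

# A common 3x3 approximation of the Laplacian (second derivative) operator.
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

# Same synthetic image as before: dark left half, bright right half.
image = np.zeros((8, 8))
image[:, 4:] = 255.0

response = convolve(image, laplacian)

# Naive zero-crossing check along each row: an edge point is where the
# Laplacian response changes sign between neighbouring pixels.
sign = np.sign(response)
zero_crossings = (sign[:, :-1] * sign[:, 1:]) < 0

print(zero_crossings.astype(int))  # 1s roughly along the dark/bright boundary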
I will make a follow-up blog post that goes more into zero-crossing methods of edge detection, including the Canny edge detector algorithm.

-------------
REFERENCES

Friday, 3 July 2015

Mini Blog: Shadows on Celestial Bodies, Umbras, and Umbrellas

I'm going to be starting a series called "Mini Blog", where each post in this series is significantly shorter than the lengthy posts I usually write. In fact, sometimes it may be just a picture. Like this one!


This picture defines umbra, penumbra, and antumbra.

Penumbra in this picture is B, and umbra is A.

The reason why the moon sometimes doesn't totally cover up the sun from our perspective on Earth during a solar eclipse is that the distance between the moon and the Earth is around the same as the distance at which the apex of the moon's umbra (cast because of the sun) lies. When the Earth is just beyond that apex, we see an antumbra instead of a full umbra.

Another fun fact: umbra is the Latin word for shadow. The English word, "umbrella", is also derived from the Latin word "umbra".


Thursday, 2 July 2015

Identifying Concavity/Convexity in Polygons (for Dummies)

The following blog post is purely based off of this link: http://debian.fmi.uni-sofia.bg/~sergei/cgsr/docs/clockwise.htm. I like explaining things in my own way to make them more understandable, which is what I will attempt to do here.

This blog post is about how to check whether a polygon is convex or not (and, subsequently, whether a polygon goes clockwise or counter-clockwise). If you do not understand what convexity means, then I think a picture is the best way to explain it:


As you can see from the above picture (which was taken off of the site I linked above), this polygon is convex at all of its vertices except at (x5, y5), where it is concave.

If you do not understand what a clockwise vs counter-clockwise polygon is, then the following description will help: say all of the edges of a polygon are represented as vectors pointing from one vertex to the next. Did you visualize it? Are all the vectors going clockwise or counter-clockwise? That is basically the difference between a clockwise and a counter-clockwise polygon.

Now let us get back to determining the convexity of a polygon. This "isConvex" logic can easily be applied in software engineering if you are, for example, building an app that requires the user to manipulate some sort of polygon that must always stay convex. Using this "convexity" check will help in implementing that.

The test to check convexity of a vertex would be to check the cross product of the two adjacent sides that meet at that vertex.

The cross product of two vectors is another vector that is 90 degrees from both of them. However, that resultant vector could point either way along the axis that is 90 degrees from both of the two original vectors. By convention, we use the right hand rule to determine which direction the resultant vector points in. This can be visualized in the below figure.

Figure 1

So by convention, if a is "behind" b like in this figure, then the cross product vector points up.

That said, let's say we changed it so that the index finger and the middle finger in this illustration swapped places, so that it looks something like this:

Figure 2. Like Figure 1, but where a is now pointing in b's direction, and b is pointing in a's direction.

Forgive me for my horrible drawing skills. As you can see, because the middle finger (which represents vector b) is now more "behind" vector a (which is now more "in front", pointing towards the audience), the cross product will instead point downwards into the ground. This is just a convention mathematicians set up.

So, surprisingly enough, with this convention, we can actually test whether a polygon is convex or not. 

Figures 3 and 4

I will now describe how this test actually works. Basically, you do this test at each vertex by looking at its adjacent sides as vectors. The two vectors of a vertex are visualized in the left-most picture above (Figure 3) as the two red arrows, and the vertex is the yellow point.

You can do a cross product of these two vectors and see if the resultant cross product is going out of the page or into the page using the right hand rule.

By definition, the cross product of two vectors, both of which lie in the x-y plane, can be simply computed as:

cross product = ((x[i] - x[i-1]), (y[i] - y[i-1])) x ((x[i+1] - x[i]), (y[i+1] - y[i])) = (x[i] - x[i-1]) * (y[i+1] - y[i]) - (y[i] - y[i-1]) * (x[i+1] - x[i])

The points (x[i], y[i]), (x[i-1], y[i-1]), and (x[i+1], y[i+1]) can all be seen in the two pictures I drew above.

Now, because a vector can be "moved" along its line of direction, in the right-most picture (Figure 4) I moved the vector that pointed from i-1 towards i into the position shown by the blue arrow. I did this to make it easier to visualize how you can use the right hand rule in this test, where the thumb can be at the yellow vertex, and your middle and index fingers can represent the vector (i to i+1) and the blue vector.

Using the cross product equation above, we determine that the blue arrow would be vector a, and the red arrow that goes from i to i+1 would be vector b. This corresponds with Figure 2, so if you do the math for the specific example shown in Figure 4, the cross product is negative (going into the ground/page). A negative cross product at a convex vertex (such as the one in Figure 4) means that the polygon is a clockwise polygon. This makes sense, since in Figures 3 and 4, I drew the two red arrows going clockwise around the polygon!

In order to actually test for convexity versus concavity, you have to do this same test at every vertex of your polygon. If the cross products at all vertices have the same sign, then your polygon is convex. If your polygon has a mixture of positive and negative cross products at its vertices, then the polygon must be concave at some points.

Determining exactly which points are convex or concave depends on whether you made the vectors of the polygon edges go around the polygon in a clockwise or counter-clockwise fashion. Like I said before, if your edge vectors go clockwise around your polygon, then convex points result in negative-signed cross products. If the polygon is clockwise and the cross product you evaluate at a certain vertex comes out positive, then you can conclude that vertex is concave.
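Here's a small illustrative Python sketch of that test (my own example, based directly on the cross product formula above): compute the z-component of the cross product at every vertex and compare signs.

def cross_z(prev_pt, curr_pt, next_pt):
    # z-component of the cross product of the two edge vectors meeting at curr_pt.
    (x_prev, y_prev), (x_curr, y_curr), (x_next, y_next) = prev_pt, curr_pt, next_pt
    return (x_curr - x_prev) * (y_next - y_curr) - (y_curr - y_prev) * (x_next - x_curr)

def is_convex(polygon):
    # True if every vertex's cross product has the same sign (ignoring zeros).
    n = len(polygon)
    signs = set()
    for i in range(n):
        z = cross_z(polygon[i - 1], polygon[i], polygon[(i + 1) % n])
        if z != 0:
            signs.add(z > 0)
    return len(signs) <= 1  # all positive or all negative -> convex

# A square (convex), and a shape with a dent in its right side (concave at one vertex).
print(is_convex([(0, 0), (2, 0), (2, 2), (0, 2)]))          # True
print(is_convex([(0, 0), (2, 0), (1, 1), (2, 2), (0, 2)]))  # False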