Tuesday 3 December 2013

Database Isolation Levels In Layman Terms

In my career, I have had to deal with database isolation levels on a handful of occasions.

I can never remember them on top of my head, and had always resort to googling, which is fine.. but most of the time I just want to know in a sentence or two what are the isolation levels and how are they different to each other.

So here it is (in increasing level of isolation and slowness..):

READ UNCOMMITTED
This means the data you read may not have been committed.  I.e. a transaction may have done an update or insert, but it hasn't actually committed it.  I.e. you may have a "Dirty Read"


READ COMMITTED
Guarantees data read has been committed at the moment it is read.  Prevents "Dirty Read".

This has 2 options in MSSQL - snapshot on/off
On - Puts a version number on the rows you are reading, so effectively every transaction sees a SNAPSHOT of the data at the time of read.

Off - Locks the rows you are reading from being edited by another transaction.

REPEATABLE READ
In addition to Read Committed, it also guarantees that any data already read CANNOT change, even in the same transaction.  I.e. with in a transaction you can read the data any time, and you are guaranteed you that the rows you read before is UNCHANGED (but you may see MORE rows added)


SERIALIZABLE
In addition to READ COMMITTED and REPEATABLE READ, it also guarantees NO NEW data can be seen by a subsequent read.  I.e. once you did a select * on a table, no concurrent transaction can UPDATE/INSERT/DELETE on that table!

A very good example I pulled from Stack Overflow:

BEGIN TRANSACTION;
SELECT * FROM T;
WAITFOR DELAY '00:01:00'
SELECT * FROM T;
COMMIT;
  • under READ COMITTED, the second SELECT may return any data. A concurrent transaction may update the record, delete it, insert new records. The second select will always see the new data.
  • under REPEATABLE READ the second SELECT is guaranteed to see the rows that has seen at first select unchanged. New rows may be added by a concurrent transaction in that one minute, but the existing rows cannot be deleted nor changed.
  • under SERIALIZABLE reads the second select is guaranteed to see exactly the same rows as the first. No row can change, nor deleted, nor new rows could be inserted by a concurrent transaction.

Wednesday 20 November 2013

JavaScript Basics

I recently gave a short prezi presentation about some basic javascript concepts that everyone claiming to know javascript should know!

Enjoy!

Click Me for the Presentation

Friday 25 October 2013

My own implementation of MergeSort

Took me an hour to nut out ... but I did it by myself without looking at the wikipedia's implementation!

Probably not the best way to implement it ... but I don't care at the moment, just feel happy that I did it :-)

 <html>  

      <script>
           function merge(left, right){
                var result = new Array();
                while (left.length > 0 || right.length > 0){
                     var smallestElement;
                     if (left.length > 0 && right.length > 0){
                          smallestElement = left[0];
                          if (left[0] > right[0]){
                               smallestElement = right[0];
                               right.splice(0,1);
                          } else {
                               left.splice(0, 1);
                          }
                     } else if (left.length > 0){
                          smallestElement = left[0];
                          left.splice(0, 1);
                     } else {
                          smallestElement = right[0];
                          right.splice(0,1);
                     }
                     result.push(smallestElement);
                }
                return result;
           }
           function mergeSort(input){
                var length = input.length;
                var left = input.splice(0, length/2);
                var right = input;
                var leftResult;
                var rightResult;
                if (left.length > 1){
                     leftResult = mergeSort(left);
                } else {
                     leftResult = left;
                }
                if (right.length > 1){
                     rightResult = mergeSort(right);
                } else {
                     rightResult = right;
                }
                return merge(leftResult,rightResult);          
           }
           (function(){
                var testInput = new Array(6,5,3,4,2,1,8,8,9,9,10,15);
                console.log(mergeSort(testInput));
           }());
      </script>
 </html>

Monday 7 October 2013

__proto__ vs .prototype

What is __proto__?

Every object in javascript has a __proto__ property.

The __proto__ property points to object that this object inherits from.

By default an object non-function points to Object.prototype.
(Object.prototype is basically the root object, and its __proto__ is null) i.e. (Object.prototype.__proto__) is null.

By default an function object points to Function.prototype (Function.prototype this is where function inherits methods like .apply(), .call(), etc...)
Function.prototype__proto__ points to Object.prototype.

Invoking a function as a constructor
This basically just means calling a function using the keyword new.

When you do that the __proto__ of  object created becomes the functions ".prototype" which by default which will be Object.prototype.

NOTE this is not the same as the function object's __proto__ which is Function.prototype.

I.e. by default:
var y = function(){}

y.__proto__ = Function.prototype
y.prototype = Object.prototype

What is .prototype?
By setting .prototype on a function object which is going to be invoked as a constructor, we are basically setting the __proto__ property for the object that is goingto be created using this function.

To Summarize

var x = {};
// x.__proto__ = Object.prototype

var f = function(){};
// f.__proto__ = Function.prototype
// f.prototype = Object.prototype

f.prototype = x;
// f.__proto__ = Function.prototype
// f.prototype = x (duh!)

var y = new f();
// y.__proto__ = x;

Wednesday 14 August 2013

SQL - What is Sargable?

Basically a query is Sargable if it can make use of indexes to improve performance.

What makes a query non-Sargable?
The most common way of making a query non-Sargable is by reference a field in a function in the where clause of a query.

For example:

SELECT ... FROM ...
WHERE Year(myDate) = 2008
This can be better written as:
WHERE myDate >= '01-01-2008' AND myDate < '01-01-2009'

So .. moral of the story ..
Avoid using functions that references a field inside the where clause of your query.

Saturday 3 August 2013

Hibernate Flush vs Commit

Flush syncs your hibernate session state to your database, BUT it still can be rolled back.

One example of where I had to flush the database is when I use HQL to query the database expecting that some of the previous changes I made to already be in the database.

E.g.
Assuming entity.A = 0 before the hibernate session began.

1. entity.setA(10);
2. invoke HQL to getA (select A from entity .. or something like that)

Line 2 will actually return 0.  NOT 10.

entity.A is 10 in the hibernate session, but not in the database.
to sync the current state to the database but still allowing the transaction to be rolled back if something goes wrong, we can perform a flush.

1. entity.setA(10);
2. session.flush()
3. invoke HQL to getA (select A from entity .. or something like that)

Line 3 should now return 10.


How does it work?
Not really sure.. I am guessing database as an extra layer to store flushed changes, before committing..


Is Flush Isolated?
i.e. If in one hibernate transaction I perform a flush, will I be able to see the changes in SSRS?

The answer depends on the database ISOLATION level.  If it's set to READ_COMMITTED, then you wouldn't be able to see it... if it's set to READ_UNCOMMITTED then you probably can..


Warning
Flush can be expensive.. hibernate probably has to figure out all the entities which has been changed/created/deleted and then write it to the database...

I wonder if that means if a flush is called 2x consecutively will the 2nd time be really quick??

Sunday 30 June 2013

Log(n)

So why is it that if u have (n) items, you have to divide it by 2, log(n) times till you get to 1 item?


This is often the analysis needed for sort algorithms e.g. Merge sort and quick sort.

The way to understand it is to think about how many times you have to divide the list in 2 until you you have only 1 item left.

Mathematically:

n / (2^x) = 1
n = 2^x
x = log(n)

Where:
n is the number of items originally on the list
x is the number of times it takes to divide the list in 2
Log is base 2



Monday 24 June 2013

Some Security Stuff (PKI, SSL/TLS, Digital Signatures etc)

Authentication vs Authorisation
Authentication is about identity.  I.e. Proving you are who you claim to be.
For example logging in with a username and password is a simple Authentication mechanism.  The server authenticates you by checking your username and password.

Authorisation is about what you are authorized to do.  It comes AFTER authentication.  Once you have been authenticated, the system can then look up what rights and privileges you have, this is authorization.

Digital Certificate (X509 Certificate)
A digital certificate should contain information about the organisation that owns the certificate and the public key which 3rd parties can use to encrypt messages going to the organisation.

A digital certificate can be self signed or signed by a CA (Certifying Authority).  A certificate that is not signed by a CA will throw up red flags in your browser because it means the identity of this organisation has not been verified.  I.e. I can create a certificate and say that I am Google.  CA's job is to verify that identity of the certificate against the actual organisation so when we receive a certificate signed by the CA, we can trust the certificate does indeed come from who it claims to be.

A signed certificate should contain the organisation's details, the public key for encrypting messages going to that organisation, and a digital signature from the signing authority.

X509 is a PKI standard.  Part of it describes what a PKI certificate should look like.

PKI
What is PKI?
Is a system for the creation, storage, and distribution of digital certificates which are used to verify that a particular public key belongs to a certain entity. The PKI creates digital certificates which map public keys to entities, securely stores these certificates in a central repository, and revokes them if needed.



SSL/TLS

What is SSL and TLS?
TLS (Transport Layer Security) is a newer version of SSL (Secure Socket Layer).
It is essentially a Protocol for secure communication over an insecure network (i.e. a network that is public and anyone can capture your packets etc..).

How to Initiate TLS/SSL?
If a server supports communication using TLS/SSL protocol, the client have 2 ways to tell the server it wants to use it.

1. Use special port numbers.  E.g. 443 for HTTPS
2. Some other ways that i don't really care about at the moment..

What Happens when I use TLS/SSL?
As mentioned TLS/SSL is a protocol.  The protocol specifies a hand shake process as follows:


  1. Client - The client tells the server it wants to use TLS/SSL.  This is done by connecting to the server on a special port (see above on Initiate TLS/SSL).
  2. Server - The server then sends it's certificate to authenticate itself to the client. The server can also request a client to send back a client certificate for authentication purpose.
  3. Client - Upon receiving the certificate the client (e.g. your browser) can verify the certificate's authenticity.  This is done using a CA (Certifying Authority)'s public key.  A certificate certified by the CA will be signed using the CA's private key.  The browser should be able to use the CA's public key to decrypt the certificate and this will prove that the certificate has not been tampered with and these guys are indeed who they claims to be.  If the authenticity of the server cannot be verified, i.e. there certificate is not signed by a CA, then the client will be warned that a secure connection cannot be established.  If the server is authenticated, then the client will use the Public Key provided in the server's certificate encrypt a "pre-master secret (generated by the client)" and send it back to the server.  If client authentication was requested, it will send the client certificate as well.
  4. Server - Once the server receives the client certificate, it will authenticate the client (i.e. the server checks to see if this certificate is trusted).  If the client can be authenicated, it will decrypt the pre-master secret using the server's private key.  It will then use the pre-master secret to generate the master secret, encrypt it using its own private key and send it back to the client.
  5. Both client and server uses the master-secret to generate a session key - which are symmetric keys.  This will be used to encrypt/decrypt futher communications between the client and server



Tuesday 21 May 2013

CSRF (Crost Site Request Forgery)

The problem:
Someone sends you an email with a link to their own website, on that site there's a link that tries to do something malicious to another site.

E.g. One can easily create a page with the following elements:
<a href="http://facebook.com/changePassword?newPassword=iknowyourpassword">

Obviously this is just a simple example, and your clearly can't hack facebook like this, but it demonstrates exactly what CSRF (Cross Site Request Forgery) is.

What is CSRF?
CSRF is basically, a site sending request that should normally only be sent from pages originated from the same domain.  In the example above, that path should probably only ever show up from the GENUINE facebook password change page.


How do we protect against it?
Clearly it doesn't take much for someone to go to the facebook change password page, and look at the source and try to mimic the same request on their website.  So how do we protect against it?

Firstly, we need to look at 2 different categories of requests: Simple and Complex (these aren't technical terms.. but it seems to be how people are referring to it)

The answer is CORS (Cross Origin Resource Sharing)

Simple
Simple request is basically, GET or POST with certain content-type header (e.g. application/x-www-form-urlencodedmultipart/form-data, or text/plain)

When you submit a simple request, the "Origin" header will ALWAYS be included, then its up to the server to decide whether they want to allow access to the resource from this domain.

The server will also send back some headers about who can send what.... so that the client knows what it is allowed and not allowed to access eg:
Access-Control-Allow-Origin

Complex 
Complex requests (i.e. anything not simple) are "pre-flighted", this means that all modern browser when a request is Cross Domain it will first send an "OPTION" request to the server asking for permission before actually sending the request.

On the Server side

With simple requests there is NO way (using modern browsers), to add custom headers to it.

So we can easily on the server side check for the existence of a non standard header, and if its there we know for sure that its not a simple request.  If its not a simple request then we know that the request had been pre flighted which means they had already asked for and been granted permission to use this resource.

If it doesn't have a custom header, then it can be a simple request in which case you probably want to check the origin header, or use a unique token (see below).

GET requests are considered "safe" (not idempotent), because a GET request shouldn't change anything.. i.e. you don't go an change password with a GET request.  So GET requests does not need to be protected.

CSRF Token
Another way to secure your web resources is through the use of a token.
The token can be rendered on to the page served up by your site, and needs to be included when you send a request.  The server (probably using a filter or something), will always check the token ensuring that it exists and that its the correct value.  The token value can be session based, i.e. you generated once for the entire user's session.

So what if someone just sends a GET request and programmatically scans the page for your token and then constructs a link with the token??

Short answer is they can't.  Apparently, you actually cannot programmatically read the response of a Cross Domain Request)!  Your browser can render it and display it on your screen, but you can't actually write some scripts to parse it!  This means they cannot programmatically extract the token and use it to construct a link!

Custom Headers

  • You can only include the header if you're calling from JS
  • Same Origin policy prevents calling outside the domain from JS
  • If you're in the same domain, CSRF is no longer the problem since you can get the token anyway