I can see your password from here!

Declan O’Riordan, head of security testing, T&VS

How to keep your business protected

I have no mind-reading abilities, but I’m going to predict that the password policy for your employer’s corporate website requires at least eight characters and at least three of the following four character types: upper-case letters, lower-case letters, numbers and special characters. Nothing unusual in that, but hackers know you probably chose a capital letter as the first character, followed by five lower-case letters and two digits, e.g. London15. If the minimum password length is nine characters, you’ll probably add another lower-case letter at position seven, e.g. Bristol15. If instead you have to use all four character types, you’ll likely add a special character at the end, e.g. Bristo1!.

This is because, instead of everyone making random, individual choices, groups of people follow patterns of behaviour that result in predictable password topologies. Let’s work through the consequences of those password topologies for cyber security and understand why 97% of LinkedIn passwords have been ‘cracked’.

Have you ever wondered what happens to your password after it has been created? It needs to be stored in a database for comparison with the password entered every time authentication is invoked for your account, for example when you log in. Some absolutely reckless organizations store and transmit those passwords ‘in the clear’ (i.e. unencrypted), which makes it incredibly easy for attackers to obtain access to every user account and cause massive damage. Since hackers make password files a priority target during a breach, every password should be ‘hashed’, i.e. converted by an algorithm (known as a cryptographic hash function) from plain text into a unique ‘hash digest’.

To authenticate a user, the password presented by the user is hashed using the same algorithm and compared with the stored hash. Hash algorithms are one-way functions. They turn any amount of data into a fixed-length ‘fingerprint’ that cannot be reversed. This also means a forgotten or lost password cannot be retrieved from its hash; it has to be replaced with a new one.

If any part of the input changes, the resulting hash is totally different as shown below:

   hash(“hello”) = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

   hash(“hbllo”) = 58756879c05c68dfac9866712fad6a93f8146f337a69afe7dd238f3364946366

   hash(“waltz”) = c0e81794384491161f1777c232bc6bd9ec38f616560b120fda8e90f383853542
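The article does not name the algorithm used for these digests, but the first one is the SHA-256 digest of "hello". As a minimal sketch, assuming SHA-256 and UTF-8 encoding (the helper names are mine, for illustration only), the same effect can be reproduced with Python’s standard library:

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_hex_digits(a: str, b: str) -> int:
    """Count the positions at which two equal-length hex digests differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    for word in ("hello", "hbllo", "waltz"):
        print(f'hash("{word}") = {sha256_hex(word)}')
    changed = differing_hex_digits(sha256_hex("hello"), sha256_hex("hbllo"))
    print(f"{changed} of 64 hex digits differ after changing one letter")
```

Changing a single input character changes most of the digest, which is why an attacker cannot work backwards from a "nearly right" hash.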

There are several ways hackers can crack the security provided by hashed passwords:

  • Dictionary attacks: Using common passwords, phrases, words and other strings such as ‘leet speak’ (“hello” becomes “h3110”) to try guessing a password, then hashing every word in the list and comparing them to the hash values in the genuine password file when the security breach is underway.
  • Brute-Force attacks: Try every possible combination of characters up to a given length. These attacks are computationally expensive and inefficient, but will always find the password eventually. The best defences make searching through all possible strings take too long to be worthwhile.
  • Lookup Tables: Pre-compute the hashes of passwords and compare them with hundreds of genuine password hashes per second. Try it against your own password by hashing it here: http://www.hashemall.com/ then using a free hash cracker https://crackstation.net/
  • Reverse Lookup Tables: Create a lookup table that maps each password hash from the compromised user account database to a list of users who had that hash, since many often have the same passwords. The attacker then hashes each password guess and uses the lookup table to get a list of users whose password was the attacker’s guess.
  • Rainbow tables: Reduce the size of lookup tables and store more hashes in the same amount of space. Saves memory but reduces cracking speed.
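To make the dictionary, lookup-table and reverse-lookup ideas above more concrete, here is a small sketch. It assumes unsalted SHA-256 hashes and a toy word list; the user names, the leet-speak substitutions and the function names are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict

# Assumed leet-speak substitutions; real cracking rules are far richer.
LEET = str.maketrans({"e": "3", "l": "1", "o": "0"})


def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def candidates(words):
    """Yield each dictionary word plus a simple leet-speak variant."""
    for word in words:
        yield word
        leet = word.translate(LEET)
        if leet != word:
            yield leet


def build_lookup_table(words):
    """Lookup table: pre-computed hash -> the guess that produced it."""
    return {sha256_hex(guess): guess for guess in candidates(words)}


def build_reverse_lookup(stolen):
    """Reverse lookup table: hash -> every user in the stolen file with that hash."""
    users_by_hash = defaultdict(list)
    for user, digest in stolen.items():
        users_by_hash[digest].append(user)
    return users_by_hash


def crack(stolen, words):
    """Hash each guess once and report all users whose password it matches."""
    users_by_hash = build_reverse_lookup(stolen)
    cracked = {}
    for guess in candidates(words):
        for user in users_by_hash.get(sha256_hex(guess), []):
            cracked[user] = guess
    return cracked


if __name__ == "__main__":
    stolen = {
        "alice": sha256_hex("h3110"),
        "bob": sha256_hex("h3110"),
        "carol": sha256_hex("x9$Qv!r2Lm"),
    }
    print(crack(stolen, ["hello", "london"]))
```

One guess cracks both "alice" and "bob" here, because identical unsalted passwords produce identical hashes.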

You might think the odds of finding matches between guessed password hashes and a database of genuine password hashes would be very long but the ‘birthday paradox’ proves that wrong. How many people do you think must be in the same room as you for the chance to be greater than even that another person has the same birthday as you? Answer = 253. How many people must be in the same room for the chance to be greater than even that at least two people share the same birthday? Answer = 23.

In the first instance you are looking for someone with a specific birthday date that matches your own. In the second instance, you are looking for any two people who share the same birthday. There is a higher probability of finding two people who share a birthday than finding another person who shares your birthday. Hackers tend to look for any two matching password hashes (their guesses and any password database hash) rather than persistently trying to brute-force one particular hash value.
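Both answers can be checked with a few lines of arithmetic. The sketch below assumes 365 equally likely birthdays and ignores leap years; the function names are mine.

```python
def people_needed_to_match_you(days=365):
    """Smallest number of other people giving a better-than-even chance
    that at least one of them shares your specific birthday."""
    n = 0
    p_no_match = 1.0
    while 1 - p_no_match <= 0.5:
        n += 1
        p_no_match *= (days - 1) / days
    return n


def people_needed_for_any_pair(days=365):
    """Smallest group size giving a better-than-even chance
    that at least two people in the group share a birthday."""
    n = 1
    p_all_distinct = 1.0
    while 1 - p_all_distinct <= 0.5:
        # Probability that person n+1 avoids the n birthdays already taken.
        p_all_distinct *= (days - n) / days
        n += 1
    return n


if __name__ == "__main__":
    print(people_needed_to_match_you())   # specific match
    print(people_needed_for_any_pair())   # any match
```

The gap between the two results is the attacker’s advantage: matching any hash in a large file is far easier than matching one chosen hash.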

A simple £2,000 password cracking tool with three high-end GPU (graphics) cards could crack all eight-character NTLM password hashes (NTLM is the Microsoft Windows NT LAN Manager authentication mechanism) in 3.7 days, all eight-character MD5 password hashes (MD5 is a message-digest algorithm used as a cryptographic hash function) in 8 days, all eight-character SHA1 password hashes (as used by LinkedIn) in 24 days, and all eight-character SHA256 password hashes in 64 days. However, it would take 999999999 billion years to crack all eight-character SCrypt password hashes (as recommended by OWASP but used by zero internet accounts). Clearly we have options!
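As a hedged illustration of the slow-hash end of that range, the sketch below stores and verifies a password with Python’s `hashlib.scrypt`. The random per-user salt and the cost parameters (`n`, `r`, `p`, `dklen`) are assumptions made for the example, not values taken from this article or a tuned recommendation, and `hashlib.scrypt` is only available when Python is built against a suitable OpenSSL.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumptions); tune them for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password, salt=None):
    """Return (salt, digest) for a password, generating a random salt if needed."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the presented password and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


if __name__ == "__main__":
    salt, stored = hash_password("London15")
    print(verify_password("London15", salt, stored))
    print(verify_password("London16", salt, stored))
```

Each guess now costs the attacker a deliberately expensive computation, and the salt means identical passwords no longer share a hash.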

As length is added, the time to crack password hashes gets longer. The same tool that could crack all eight-character MD5 password hashes in 8 days would take 750 days to crack all nine-character MD5 password hashes, 188 years to crack all ten-character MD5 password hashes, and 17,000+ years to crack all 11-character MD5 password hashes.
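The growth is roughly what you would expect if every extra character multiplies the search space by the size of the character set. A rough sketch, assuming the 95 printable ASCII characters and a constant guessing rate (both assumptions are mine, not stated in the figures above):

```python
CHARSET_SIZE = 95  # assumption: printable ASCII characters


def keyspace(length, charset_size=CHARSET_SIZE):
    """Number of possible passwords of exactly this length."""
    return charset_size ** length


def scaled_crack_days(base_days, base_length, target_length, charset_size=CHARSET_SIZE):
    """Scale a known exhaustive-search time to a longer password length."""
    return base_days * charset_size ** (target_length - base_length)


if __name__ == "__main__":
    for length in (9, 10, 11):
        days = scaled_crack_days(8, 8, length)
        print(f"{length} characters: {days:,} days (about {days / 365:,.0f} years)")
```

Starting from 8 days for eight characters, this gives about 760 days, 198 years and 18,800 years, broadly in line with the quoted figures; the differences presumably come from rounding in the baseline.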

Now remember those predictable password topologies I mentioned in the introduction? They really shorten the odds of a cracking tool finding a matching password hash after a site is compromised because priority is given to the common topologies.

Generally, computer users have no clue what makes a password complex and therefore use simple, predictable passwords. Users typically pick the lowest-common-denominator that will be allowed by policies. Most Internet sites do not actually require users to choose complex passwords, and are a decade behind enterprises on password policy, and those enterprise policies haven’t changed for ten years. Password cracking however has progressed a great deal and makes mincemeat of many poorly defended password files.

One of the new methods is ‘Password reuse’. For example, at LinkedIn the same domain administrator password of 11 random characters had been used on LinkedIn one year previously, and had already been cracked. Hackers will reuse the same passwords they have cracked previously, including the passwords you use on other sites they have cracked. Think about your Facebook, LinkedIn, Skype, Twitter and corporate passwords. How similar are they? Hackers probably already have at least one of them.

Another method is password generation based on previous data. People stick to predictable formulas, like putting a number at the end of the password.

In addition, there is generation based upon the user base or the source of a password leak (e.g. Link, Linked, Linkedin, LinkedIn).
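A rough sketch of how such guesses might be generated from a handful of site-related base words. The capitalisation rules, the suffix list and the function names are assumptions for illustration; real cracking rule sets are much larger.

```python
def case_variants(word):
    """Return common capitalisation variants of a word, without duplicates."""
    variants = [word, word.lower(), word.capitalize(), word.upper()]
    return list(dict.fromkeys(variants))


def site_candidates(base_words, suffixes):
    """Combine case variants of each base word with common numeric endings."""
    guesses = []
    for base in base_words:
        for variant in case_variants(base):
            guesses.append(variant)
            for suffix in suffixes:
                guesses.append(f"{variant}{suffix}")
    return list(dict.fromkeys(guesses))


if __name__ == "__main__":
    guesses = site_candidates(["Link", "Linked", "LinkedIn"], ["1", "15", "2015"])
    print(len(guesses), guesses[:8])
```

Three base words and three endings already produce dozens of plausible guesses, each of which is cheap to hash and test.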

Finally there are pattern based (topologies) attacks using selective brute-force. Rather than testing all possible passwords, the attack targets some specific subsets, and tries all passwords that fit the pattern (topology). Since users gravitate towards certain topologies, a disproportionate number of passwords can be cracked by targeting those topologies.
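To see why this pays off, the sketch below compares the number of guesses needed for one topology with a full brute-force search of the same length. It uses the U/l/d notation for upper-case, lower-case and digit, and assumes a 95-character printable ASCII set for the full search; the function names are hypothetical.

```python
import itertools
import string

# Character pool for each topology symbol (special characters omitted for brevity).
CLASSES = {
    "U": string.ascii_uppercase,
    "l": string.ascii_lowercase,
    "d": string.digits,
}


def topology_keyspace(topology):
    """Number of passwords that fit a topology such as 'Ullllldd'."""
    size = 1
    for symbol in topology:
        size *= len(CLASSES[symbol])
    return size


def candidates_for(topology):
    """Lazily generate every password that fits the topology."""
    pools = [CLASSES[symbol] for symbol in topology]
    for chars in itertools.product(*pools):
        yield "".join(chars)


if __name__ == "__main__":
    topo = "U" + "l" * 5 + "dd"   # one capital, five lower-case letters, two digits
    targeted = topology_keyspace(topo)
    full = 95 ** len(topo)
    print(f"{topo}: {targeted:,} guesses versus {full:,} for full brute force")
    print(f"about {full // targeted:,} times fewer guesses")
    print(list(itertools.islice(candidates_for(topo), 3)))
```

Under these assumptions the single topology needs roughly 31 billion guesses, about 200,000 times fewer than searching every eight-character password.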

In a sample of 263,888 password hashes at a Fortune 100 enterprise, 7,308 unique topologies were found. The top five patterns were used by 48% of all users, and the top 100 patterns were used by 85% of all users. Yet 99.9% of the passwords met their complexity requirements. Where U=uppercase letter, l=lowercase letter, and d=digit, the most popular were:

       Ullllldd (8 char) 12.7% e.g. London15

       Ulllllldd (9 char) 12.7% e.g. Bristol15

       Ullldddd – 10.6% e.g. Pass2015

       Ulllllllldd – 7.3% e.g. Password15

       Ulllldddd – 5% e.g. Nicky2015
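A defender can run the same analysis on their own users’ choices, for example when a new password is set and before it is hashed. The sketch below maps each password to its topology and reports the most common ones; the symbol "s" for special characters and the function names are my additions, and the sample passwords are simply the examples used in this article.

```python
from collections import Counter


def topology(password):
    """Map a password to its topology, e.g. 'London15' -> 'Ullllldd'."""
    symbols = []
    for ch in password:
        if ch.isupper():
            symbols.append("U")
        elif ch.islower():
            symbols.append("l")
        elif ch.isdigit():
            symbols.append("d")
        else:
            symbols.append("s")  # assumption: 's' marks any other character
    return "".join(symbols)


def top_topologies(passwords, n=5):
    """Return the n most common topologies with the share of passwords using each."""
    counts = Counter(topology(p) for p in passwords)
    total = sum(counts.values())
    return [(topo, count / total) for topo, count in counts.most_common(n)]


if __name__ == "__main__":
    sample = ["London15", "Denver14", "Bristol15", "Pass2015", "Password15", "Nicky2015"]
    for topo, share in top_topologies(sample):
        print(f"{topo:<12} {share:.1%}")
```

A policy could then reject or discourage new passwords whose topology is already over-represented.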

Armed with this insight, hackers are attacking password files and thousands of sites are suffering massive breaches. 630 sites using the MD5 hashing algorithm alone were compromised last year. Only about 1% of password hacks become public knowledge because most organizations prefer not to admit their customer data has been abused. We only know that 97% of LinkedIn passwords have been cracked because the password hashes were posted anonymously on a message board six months after they were obtained.

Many tools intended to rate password strength are very inaccurate because they ignore topologies. The password complexity meter ‘How secure is my password’ estimated that ‘Denver14’ would take 15 hours to crack, when actually it took researchers from KoreLogic less than two minutes. However, the Kaspersky complexity meter rated ‘Denver14’ as three seconds to crack because it recognised the topology and is scripted to know how hackers and password crackers work.

We need better defences to frustrate password crackers. There are many, many defences that can be put in place to control the threat of password cracking, but everything associated with cryptography seems hard and is often poorly understood, even by the project teams implementing solutions.

Do get in touch if you’d like T&VS to provide hands-on advice for enforcing and evolving password complexity, black-listing the most common predictable topologies and dictionary words, limiting the number of users gravitating towards the same topologies, requiring a minimum topology change between old and new passwords, password rotation, password storage, hash formats, salts, using key stretching functions, and transferring our security knowledge into your project teams to make a long-term difference: http://www.testandverification.com/solutions/security/

Blog references:
OWASP: https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet
Adriancs & Taylor Hornby: Salted Password Hashing – Doing it Right
Shon Harris: All in one CISSP
KoreLogic research: https://blog.korelogic.com/blog/2014/04/04/pathwell_topologies