Biz & IT —

How I became a password cracker

Cracking passwords is officially a "script kiddie" activity now.

How I became a password cracker
Aurich Lawson

At the beginning of a sunny Monday morning earlier this month, I had never cracked a password. By the end of the day, I had cracked 8,000. Even though I knew password cracking was easy, I didn't know it was ridiculously easy—well, ridiculously easy once I overcame the urge to bash my laptop with a sledgehammer and finally figured out what I was doing.

My journey into the Dark-ish Side began during a chat with our security editor, Dan Goodin, who remarked in an offhand fashion that cracking passwords was approaching entry-level "script kiddie stuff." This got me thinking, because—though I understand password cracking conceptually—I can't hack my way out of the proverbial paper bag. I'm the very definition of a "script kiddie," someone who needs the simplified and automated tools created by others to mount attacks that he couldn't manage if left to his own devices. Sure, in a moment of poor decision-making in college, I once logged into port 25 of our school's unguarded e-mail server and faked a prank message to another student—but that was the extent of my black hat activities. If cracking passwords were truly a script kiddie activity, I was perfectly placed to test that assertion.

It sounded like an interesting challenge. Could I, using only free tools and the resources of the Internet, successfully:

  1. Find a set of passwords to crack
  2. Find a password cracker
  3. Find a set of high-quality wordlists and
  4. Get them all running on commodity laptop hardware in order to
  5. Successfully crack at least one password
  6. In less than a day of work?

I could. And I walked away from the experiment with a visceral sense of password fragility. Watching your own password fall in less than a second is the sort of online security lesson everyone should learn at least once—and it provides a free education in how to build a better password.

My not-particularly-l33t cracking setup: a 2012 Core i5 MacBook Air and a Terminal window. The five columns of text in the Terminal window are a small subset of the hashes I cracked by day's end.
Enlarge / My not-particularly-l33t cracking setup: a 2012 Core i5 MacBook Air and a Terminal window. The five columns of text in the Terminal window are a small subset of the hashes I cracked by day's end.

“Password recovery”

And so, with a cup of tea steaming on my desk, my e-mail client closed, and some Arvo Pärt playing through my headphone, I began my experiment. First I would need a list of passwords to crack. Where would I possibly find one?

Trick question. This is the Internet, so such material is practically lying around, like a shiny coin in the gutter, just begging you to reach down and pick it up. Password breaches are legion, and entire forums exist for the sole purpose of sharing the breached information and asking for assistance in cracking it.

Dan suggested that, in the interest of helping me get up to speed with password cracking, I start with one particular easy-to-use forum and that I begin with "unsalted" MD5-hashed passwords, which are straightforward to crack. And then he left me to my own devices. I picked a 15,000-password file called MD5.txt, downloaded it, and moved on to picking a password cracker.

Password cracking isn't done by trying to log in to, say, a bank's website millions of times; websites generally don't allow many wrong guesses, and the process would be unbearably slow even if it were possible. The cracks always take place offline after people obtain long lists of "hashed" passwords, often through hacking (but sometimes through legal means such as a security audit or when a business user forgets the password he used to encrypt an important document).

Hashing involves taking each user's password and running it through a one-way mathematical function, which generates a unique string of numbers and letters called the hash. Hashing makes it difficult for an attacker to move from hash back to password, and it therefore allows websites to safely (or "safely," in many cases) store passwords without simply keeping a plain list of them. When a user enters a password online in an attempt to log in to some service, the system hashes the password and compares it to the user's stored, pre-hashed password; if the two are an exact match, the user has entered the correct password.

For instance, hashing the password "arstechnica" with the MD5 algorithm produces the hash c915e95033e8c69ada58eb784a98b2ed. Even minor changes to the initial password produce completely different results; "ArsTechnica" (with two uppercase letters) becomes 1d9a3f8172b01328de5acba20563408e after hashing. Nothing about that second hash suggests that I am "close" to finding the right answer; password guesses are either exactly right or fail completely.

Prominent password crackers with names like John the Ripper and Hashcat work on the same principle, but they automate the process of generating attempted passwords and can hash billions of guesses a minute. Though I was aware of these tools, I had never used one of them; the only concrete information I had was that Hashcat was blindingly fast. This sounded perfect for my needs, because I was determined to crack passwords using only a pair of commodity laptops I had on hand—a year-old Core i5 MacBook Air and an ancient Core 2 Duo Dell machine running Windows. After all, I was a script kiddie—why would I have access to anything more?

I started on the MacBook Air, which meant that I had got to use the 64-bit, command-line version of Hashcat rather than the Windows graphical interface. Now, far be it from me to sling mud at command line lovers, who like to tell me endless stories about how they can pipe sed through awk and then grep the whole thing about 50 times more quickly than those poor schlubs clicking their mice on pretty icons and menus. I believe them, but I still prefer a GUI when trying to figure out the many options of a complex new program—and Hashcat certainly fit the bill.

Still, this was for science, so I downloaded Hashcat and jumped into Terminal. Hashcat doesn't include a manual, and I found no obvious tutorial (the program does have a wiki, as I learned later). Hashcat's own help output isn't the model of clarity one might hope for, but the basics were clear enough. I had to instruct the program which attack method to use, then I had to tell it which algorithm to use for hashing, and then I had to point it at my MD5.txt file of hashes. I could also assign "rules," and there were quite a few options to do with creating masks. Oh, and wordlists—they were an important part of the process, too. Without a GUI and without much in the way of instruction, getting Hashcat to run took the best part of a frustrating hour spent tweaking lines like this:

./hashcat-cli64.app MD5.txt -a 3 -m 0 -r perfect.rule

The above line was my attempt to run Hashcat against my MD5.txt collection of hashes using attack mode 3 ("brute force") and hashing method 0 (MD5) while applying the "perfect.rule" variations. This turned out to be badly misguided. For one thing, as I later learned, I had managed to parse the syntax of the command line incorrectly and had the "MD5.txt" entry in the wrong spot. And brute force attacks don't accept rules, which only operate on wordlists—though they do require a host of other options involving masks and minimum/maximum password lengths.

This was a bit much to muddle through with command-line switches. I embraced my full script kiddie-ness and switched to the Windows laptop, where I installed Hashcat and its separate graphical front end. With all options accessible by checkboxes and dropdowns, I could both see what I needed to configure and could do so without generating the proper command line syntax myself. Now, I was gonna crack some hashes!

Could an aging Dell laptop make me a "hashkiller"?
Enlarge / Could an aging Dell laptop make me a "hashkiller"?

The first hit

I began with attack mode 0 ("straight"), which takes text entries from a wordlist file, hashes them, and tries to match them against the password hashes. This failed until I realized that Hashcat came with no built-in worldlist of any kind (John the Ripper does come with a default 4.1 million entry wordlist); nothing was going to happen unless I went out and found one. Fortunately, I knew from reading Dan's 2012 feature on password cracking that the biggest, baddest wordlist out there had come from a hacked gaming company called RockYou. In 2009, RockYou lost a list of 14.5 million unique passwords to hackers.

As Dan put it in his piece, "In the RockYou aftermath, everything changed. Gone were word lists compiled from Webster's and other dictionaries that were then modified in hopes of mimicking the words people actually used to access their e-mail and other online services. In their place went a single collection of letters, numbers, and symbols—including everything from pet names to cartoon characters—that would seed future password attacks." Forget speculation—RockYou gave us a list of actual passwords picked by actual people.

Finding the RockYou file was the work of three minutes. I pointed Hashcat to the file and let it rip against my 15,000 hashes. It ran—and cracked nothing at all.

At this point, sick of trying to puzzle out best practices by myself, I looked online for examples of people putting Hashcat through its paces, and so ended up reading a post by Robert David Graham of Errata Security. In 2012, Graham was attempting to crack some of the 6.5 million hashes released as part of an infamous hack of social network LinkedIn, he was using Hashcat to do it, and he was documenting the entire process on his corporate blog. Bingo.

He began by trying the same first step I had tried—running the complete RockYou password list against the 6.5 million hashes—so I knew I had been on the right track. As in my attempt, Graham's straightforward dictionary attack failed to produce many results, identifying only 93 passwords. Whoever had hacked LinkedIn, it appeared, had already run such common attacks against the collection of hashes and had removed those that were simple to find; everything that was left presumably would take more work to uncover.

Channel Ars Technica