The definitive guide to forms based website authentication (stackoverflow.com)
338 points by mmare on Nov 7, 2012 | hide | past | favorite | 73 comments


Well, this is weird. I created this question when StackOverflow was just out of beta, hoping to steer it toward broader questions - guides, if you wish. This question really took off, but the format didn't, and SO mostly became a stack of incredibly specific questions and answers.

And now somebody, but not me, has submitted this question to HN. Under my name. I'm puzzled...


It's not under your name, unfortunately. mdemare != mmare


But it indeed seems to be somebody who is claiming to be the author of the SO post (perhaps for karma?).

They only created the HN account today in order to post this story, and they chose a username which is obviously a reference to Michiel de Mare from the SO post.

Whereas according to mmare's profile, he's been around on HN for a good while now.

I agree that it's strange.


I'm normally highly sceptical of anything which is essentially a how to guide on security of, well, anything but I have to say whoever this author is they absolutely know their stuff.

Normally security advice is just 1980s circle-jerking of the same meaningless "sound good" concepts (e.g. "At least one upper-case, number, special character") but actually, no, not in this case.

Instead he is giving advice which is modern, which is based on how people actually use these systems, and also the common mistakes developers make while building them (e.g. not hashing forgotten password keys).

He even linked to NIST Special Publication 800-63 and THEN talked about login attempts over time. This dude is just incredible. I literally couldn't have written a better article than this.


To provide a counterpoint, the section about the "Remember Me"-cookie is rather terrible (I stopped reading after that).

It's not fundamentally flawed but rather inelegant (and potentially expensive) to store a magic number server-side for each session. You can implement the same thing more easily by handing out tamper-proof (HMAC) cookies containing the start- and end-time, and storing only the last_logout-timestamp for each user on the server-side.

Any cookie where expire_at < now or created_at < last_logout is to be rejected at validation time.
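
A minimal sketch of that scheme in Python, using only the standard library. The cookie layout, the `SECRET_KEY` value and the `last_logout_for` lookup are assumptions made for illustration; nothing here comes from the SO answer itself.

```python
import hashlib
import hmac
import time
from typing import Callable, Optional

# Hypothetical secret; in practice load it from configuration, never from source control.
SECRET_KEY = b"replace-with-a-long-random-secret"


def _sign(payload: str) -> str:
    """Hex-encoded HMAC-SHA256 of the payload under SECRET_KEY."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def make_cookie(user_id: str, lifetime_s: int, now: Optional[float] = None) -> str:
    """Build 'user_id|created_at|expire_at|signature' (user_id must not contain '|')."""
    created_at = int(time.time() if now is None else now)
    payload = f"{user_id}|{created_at}|{created_at + lifetime_s}"
    return f"{payload}|{_sign(payload)}"


def validate_cookie(
    cookie: str,
    last_logout_for: Callable[[str], int],
    now: Optional[float] = None,
) -> Optional[str]:
    """Return the user id if the cookie is authentic and still valid, else None."""
    now = time.time() if now is None else now
    try:
        payload, signature = cookie.rsplit("|", 1)
        user_id, created_raw, expire_raw = payload.split("|")
        created_at, expire_at = int(created_raw), int(expire_raw)
    except ValueError:
        return None  # malformed cookie
    # Constant-time comparison, so the check does not leak how many characters matched.
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return None
    # The two rejection rules: expired, or issued before the user's last logout.
    if expire_at < now or created_at < last_logout_for(user_id):
        return None
    return user_id
```

The only per-user state is the value `last_logout_for` returns, so a logout or password change invalidates every outstanding cookie for that user by bumping one timestamp.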


The problem for most mere mortals with the HMAC scheme is that there is a very real possibility of the server being compromised and the secret key being stolen. In this case an adversary could generate valid cookies for any user. However, with the magic number scheme as long as only the hash of the random number is stored (we should add a per-user salt too) then the entire session database could be compromised, but an attacker cannot do anything with it.

Also, though less of an issue with SSL, the HMAC approach is subject to replay attacks for as long as now < expire_at.

EDIT: The HMAC approach also lacks any method to invalidate cookies manually, or automatically e.g. when the user changes their password. This means a compromised account is open to attack until expire_at < now, and there's nothing you can do about it other than blocking the account for that duration, which now means that each request needs to do a database lookup to see if the account is blocked. You could generate a per-user secret key, but now you have a database lookup again, so you might as well use the magic number scheme.


If your server is compromised you tend to have bigger problems than session forgery. Why would an attacker bother to fabricate http-sessions after he already gained access to your database and, in most cases, source-code?

Replay-attacks work the exact same whether you use this scheme or store a magic number server-side. There's no difference whatsoever.

For a password-change you update the last_logout timestamp and hand the user a new cookie (since his current one was just invalidated).


I think that is the beauty of this post: it wasn't one person... However, Jeff Atwood seems to be the main contributor. He was also the co-founder of Stack Overflow.

http://stackoverflow.com/posts/477578/revisions

The power of team work.


He's probably incorporating information from highly rated posts below, rather than generating content.


That's what he says he did in the related meta post: http://meta.stackoverflow.com/questions/95172/old-problemati...


As a rule most security advice on stack overflow is dangerously wrong. It's just not a good topic for the site, because consensus is often wrong on such complicated questions.

I don't see anything obviously wrong with this particular article (aside from challenge response or SSL choice - one should just always use SSL, and if you can't, then seek professional advice), however I am still apprehensive of the hive mind.


There was some information in there about SRP being patented that I thought was misleading. It is patented, but it's freely licensed.


The main problem with SRP being mentioned at all is that it has no meaningful security value in a web application context. It makes sense when the client does not entirely trust the server, which makes no sense when you deliver the client as a bunch of .js files from the same "untrusted" server.


Very true, however, most users are willing to download such things as native programs (e.g. installing web browsers) from unauthenticated sources over unencrypted connections. If you are in a position to inject .js resources, then the user's security would be compromised anyway.

EDIT: SRP is going to be integrated into TLS soon anyway, so we might as well hold our breath for that.


2 points by sreeix 160 days ago | flag | discuss http://news.ycombinator.com/item?id=4047424

316 points by moonlighter 457 days ago | flag | comments http://news.ycombinator.com/item?id=2859234


One url is terminated by a /. HN isn't doing URL normalization (by hitting the URL and only recording what is at the end of the 302-redirect-chain)


Added a mention of/link to Mozilla Persona.

IMO, it's the easiest way to handle authentication today, fully decentralized, secure, and with nice privacy guarantees. With it, you don't have to care about user names (just use email addresses), passwords and secure storage thereof, it mostly just works (and once it gets linked into the big email providers in December or so, almost everyone will already have an account).


How does it prevent the data-sniffing issue when you are not using SSL? Or can you simply not use Persona without SSL?


Presuming you're using session cookies, Persona is no less secure than any other reasonable authentication system when used without SSL.

It also has the nice property that what Persona transmits over the wire -- the proof of identity -- is only valid for 120 seconds. Sniffing it in real time would temporarily allow you to masquerade as another user on that specific site, but any sort of delay and you're locked out.

This is a huge improvement over, say, transmitting passwords, which could grant access to an account for months or years.


In the article they talk about the 500 worst passwords of all time. Here is a gist listing those passwords. https://gist.github.com/4033452

Might be useful for some of you.


As an X-Files fanboy, I was pleased to see "trustno1" on that list!


That list was more obscene than I figured it would be.


Where I work we use something simple like kerberos/basic/digest/custom http header authentication on our apps, and then put Apache with mod_auth_form in front of it (or ISA server).

I even wrote an authentication reverse proxy[1] in java in my spare time, so I can use that to publish my apps, and have SSO across all of them (until BrowserID becomes mainstream that is). This way I centralized the cookie auth problem, and don't need to care about it in every app.

[1]http://p.r0xy.it/


I hope this gains traction before it's closed as subjective or such...


it's mostly good. NIST abolished their algo for password entropy estimation some time ago. i do not much like any password strength tests, most of which rate any number of terrible passwords as strong. as such i think they give a false sense of security. maybe consider cracklib.

as DenisM said, always use SSL for all traffic if security matters and don't trust SO for security advice.


The only really useful password strength test would be one that said "A stock Thinkpad would be able to brute force this password in $x hours and $y minutes."

Might make people think twice about that six character password.
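
A rough sketch of the arithmetic behind such a message. The guess rate is a made-up parameter, not a measurement of any real machine, and the estimate is worst case (the attacker tries every combination).

```python
from typing import Tuple


def worst_case_crack_time(alphabet_size: int, length: int,
                          guesses_per_second: int) -> Tuple[int, int]:
    """Return (hours, minutes) needed to try every password of the given shape.

    guesses_per_second is an assumption supplied by the caller; it depends
    entirely on the hardware and on how the password is hashed.
    """
    keyspace = alphabet_size ** length          # number of possible passwords
    seconds = keyspace // guesses_per_second    # whole seconds to exhaust them all
    return seconds // 3600, (seconds % 3600) // 60


# Six lowercase letters at a hypothetical one million guesses per second.
hours, minutes = worst_case_crack_time(26, 6, 1_000_000)
print(f"Brute-forceable in {hours} hours and {minutes} minutes")
```

Going from six to ten lowercase characters multiplies the keyspace by 26^4 (about 457,000), which is the kind of jump the message is meant to make visible.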


How about a response that says "we just googled that combination of email address and the md5 hash of that password, it's been listed in at least 7 different database disclosures, including the Gawker one, the Sony one, and 5 different pr0n site compromises. We suggest using a different password here."

;-)


That is only useful if you also specify the conditions under which the "cracking" takes place. Do you mean on-line password guessing? Or do you mean brute-forcing a hashed password leaked from a database?

In case one a lock-out policy would quite easily negate your attack. In case two the hashing algorithm used is often more important than the length and complexity of the password (up to a point of course, but that point is nowadays well beyond what's practical for a user)


I think you're expecting too much from users. They need to know what "brute forcing" a password means, what a stock Thinkpad is, and why it matters.


Regarding website authentication, I've been looking for some feedback on a new auth scheme.

Instead of using a standard password (all characters are allowed, min 5 characters, common passwords not allowed), you're able to log in with a 4 digit passcode. I know someone just cringed at that thought, but the idea centers on improving user experience on the website.

First, all normal precautions would be taken (no common digit patterns - 1234, 1111, 2222, etc). There would also be a limit of two attempts before the passcode is reset. The reset procedure would be them receiving a new passcode via SMS, and them having to reply "yes" before the account is unblocked. The passcode is also reset every month, and a new one is sent via SMS to your phone (you can reply to change the passcode to something else).

Now for the issues I would need to address before this is even a possibility:

1) Users on the website login with their phone number, so one obvious attack would be someone cycling through all possible phone numbers with the same passcode (for example 8237). One suggestion in the article was detecting average error rates and comparing them to see if the entire website login should be throttled.

2) If someone somehow gets a hold of the database, all passcodes would be easily crackable. Now usually this would be a huge issue, but this is because normally people could use the email/password combination to login to other websites the user might use. Since they're using 4 digit passcodes, this wouldn't apply.

3) Someone could write a script to try phone number/passcode combinations until the entire website has their passcode reset, but this would fall under 1) where the error rates would exceed the normal limits and the logins would be throttled.

4) What would be an appropriate way to throttle? I mentioned it twice above, and in the article it was referring to a timeout, but the user experience of this would negate all benefits of a 4 digit passcode. Someone could keep trying combinations, and keep throttling the site every day. I could block the IPs, but what if those IPs were also sources of legitimate traffic, and blocking them stopped users from logging in or signing up?

Thoughts?


Sorry, nothing personal.

But this 'new' approach feels like last decade online banking - and it wasn't a good idea at that point.

In addition: Limiting user input and forcing password resets is, in my world, directly acting against your idea of 'improving user experience'.

If I am allowed to use a password of my choosing, I'll probably come up with something that is memorable and reasonably secure (depending on the context, I admit). If you force me to follow random, voodoo rules (just digits, at least one digit and one upper-case letter, more than x but LESS THAN y chars) I'm going to sigh, come up with something like 'YeahRight123' and I'm going to add a mental note to never trust this service fully. If I'm not leaving right away, that is. Resetting a password regularly (oh.. I hate everything noticeable SOX forces upon us)? Cool, you just motivate me to make my password 'cool123' - 'cool234' etc. (with variations for 'clever' password checks. If I cannot keep a prefix, I'll juggle different parts and keep the same, crappy, useless, insecure password, because .. I cannot be bothered to follow arbitrary idiot rules)

Your idea follows the worst practices in terms of restricting the keyspace and auto-resetting the password at arbitrary times, starting out weak already (4 digits..).

I wouldn't sign up with some 'security' in place that follows your suggestion.


It's great to get different perspectives on the concept. I agree that enforcing rules and resets does impact user experience.

What if the user was to authenticate once via SMS (we send them a code and they enter it within a reasonable time period), and once they do, they're authenticated for an infinite amount of time. This way they don't need to remember a passcode, and just need to have their phone on them when accessing the website from a new computer - a similar experience to two factor auth.


You're now outsourcing your users' security to their cell phone provider.

Was it Twitter who had their domain hijacked by someone ringing up the right telco and saying something like "my cell phone is out of action temporarily, can you please forward all calls/messages to this other number?" in a sufficiently convincing fashion to some minimum wage telco support staff, then getting a two factor auth token sent to an attacker controlled number?

I think a lot of webdevs make assumptions about SMS "security" that are quite unfounded.


Thanks for the feedback! You brought up a valid point. It's something that will become more of an issue as the website increases its user base and we'll think of ways to address it.


Here's another article probably of interest:

http://www.itnews.com.au/News/322194,telcos-declare-sms-unsa...

"The lobby group for Australian telcos has declared that SMS technology should no longer be considered a safe means of verifying the identity of an individual during a banking transaction."


> First, all normal precautions would be taken (no common digit patterns - 1234, 1111, 2222, etc).

Why? All you are doing is further reducing an already limited key space.

This authentication scheme is bad, and you should feel bad. :)


Agreed, I'm just toying with the idea of finding the simplest way for a user to access a website securely. Haha, that's why I posted here before implementing it ;)

We'll be focusing on mobile, and the login process could be something like PayPal's mobile app where they let you login with your phone number and PIN (min 4 digits). I'm just looking for a secure way to translate that to a web app.

Something that could help - sessions could persist for an infinite amount of time, so upon first login we send them 4 random digits via SMS and if they enter it correctly they're authenticated. Basically two factor auth without the initial password.


How is this any better than passwords? 1/10000 chance of guessing correctly is huge.

Why would I want to remember a different passcode every month?

If after two failed login attempts, I must respond to an SMS before I log in, it's really easy to DOS.

Users will be confused by this new scheme. Stick with what has already been vetted in the industry.
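
To put numbers on the 1/10000 point, here is a small sketch. It assumes passcodes are chosen uniformly at random and that the attacker gets a fixed number of guesses per account; the account count is a made-up example.

```python
from fractions import Fraction


def guess_probability(keyspace: int, attempts: int) -> Fraction:
    """Chance of hitting one account's code with `attempts` distinct guesses."""
    return Fraction(min(attempts, keyspace), keyspace)


def expected_compromised(accounts: int, keyspace: int, attempts: int) -> Fraction:
    """Expected number of accounts opened when every account gets `attempts` guesses."""
    return accounts * guess_probability(keyspace, attempts)


# 4-digit codes, two tries per account before the reset kicks in.
print(guess_probability(10_000, 2))                # per-account chance
print(expected_compromised(1_000_000, 10_000, 2))  # across a hypothetical million accounts
```

Banning common patterns such as 1234 or 1111 shrinks `keyspace` further, which raises both figures.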


Yeah that's true, thinking about it further a way to solve this could be allowing infinite sessions with the initial passcode being sent to you via SMS.

This way you wouldn't need to remember a different passcode each month and the login attempt issue wouldn't exist because passcodes are generated when you need to login.


Sounds expensive. I already pay ~$1/month in SMS fees for TFA on my Google account. If it cost me $0.20 every time I fatfingered my password, I would probably stop using your service. What's worse, attack #3 would cost the victim even more money, and it's one thing to get charged for your own screwups. It's another thing altogether to get charged for somebody else trying to hack you.


Hm, that's a good point. I've found that most people have a text message plan which allows for unlimited incoming text messages but we'll take that into account and make it clear to the user.


> I see multiple, severe problems with this old question from 2008 and I am tempted to delete it outright -- primarily because the most highly voted answers read more like blog rants than actual "answers".

http://meta.stackoverflow.com/questions/95172/old-problemati...


It is worth noting that Atwood is the one saying that, and I would trust a random 3rd grade student's opinion on the subject over his. Notice how he is unable to provide any actual criticism of the answer, just "I don't like it"? That is his way of saying "I don't know what I am talking about, so someone else criticize it and then I'll jump in and say I was going to say that".


It says if you're going to use captcha, use reCaptcha because it is "by definition hard for ocr". I think it is completely mistaken.

Two words are shown for reCaptcha, one that is "by definition" ocr easy and one that is hard. You don't need to "solve" the one that is hard. In fact, you can put anything for the hard one. You only need to solve the part that is "by definition" ocr easy.


Two things that stood out to me:

Given that the most common 50 passwords are known, why not reject them outright? Simply state to the user: your password is too easy to guess.

Passwords should always allow spaces in order to allow people to use easier to remember passwords, a la xkcd.

http://preshing.com/20110811/xkcd-password-generator
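
Both points fit in a few lines. This is only a sketch: `COMMON_PASSWORDS` is a tiny illustrative stand-in for a real list such as the one linked elsewhere in this thread, and the function name is made up.

```python
from typing import Optional

# Illustrative subset only; a real deployment would load a full list from a file.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "trustno1"}


def check_password(candidate: str) -> Optional[str]:
    """Return an error message for the user, or None if the password is accepted.

    No character is forbidden, so spaces (and therefore passphrases) are allowed.
    """
    if candidate.lower() in COMMON_PASSWORDS:
        return "Your password is too easy to guess."
    return None


print(check_password("trustno1"))
print(check_password("correct horse battery staple"))
```

The comparison is case-insensitive so that "TrustNo1" is rejected along with "trustno1".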


Quick note about CAPTCHAs... A more accurate rate is $1.50 per 1000, and that's even a tad expensive.

If you buy in bulk, it's much cheaper.

Source: Security researcher.


The first answer mentions a couple of times that any token given to the user (for remember-me login or password reset) should be hashed in the database.

Would it be possible to replace the whole storing by signing the token with some private key, so that the validity of the token can be checked without having to compare it to some stored value?


Yes, you could use an HMAC for this, however you need to keep the private key, well... private, which in practice is not easy. If the server is compromised, an attacker could steal the secret key and use it to generate signed cookies for any user. This method is also subject to replay attacks for the duration of the token's validity, though that is less relevant with SSL.

Whereas if only token hashes are stored in the database, then the entire database could be stolen and nobody can use it to generate valid cookies.

EDIT: Also, if an account goes rogue you have no way to invalidate its cookies, so you'll have to do a lookup for each request to see if the account is blocked.


I'm jaded, but the first thing I thought when I read this was:

"If I asked this question, 5 minutes later it would be closed as subjective"


What do people think of services like https://www.loginprompt.com/? (provides logins as a service for your startup)

Isn't this sort of security something we wish we didn't have to learn? And for people who don't take the time maybe it's best to let a third-party handle it.


> Isn't this sort of security something we wish we didn't have to learn?

Absolutely. Time spent on your auth scheme is time you're not spending on building your product. (And half-assing your auth scheme generally comes back to bite people.)

That said, outsourcing it to a centralized provider may not be the best idea for business, user, or security reasons. So it's a balance.

Of course, I'm biased: I work on the Persona team at Mozilla, where we're trying to build a simple, secure, fully decentralized, and open source authentication system that fits that niche rather nicely, but the points above stand: you have to figure out the opportunity cost of your chosen solution. There's no universal answer.


100% agree with you. I love the concept of Persona, but it has a serious cold-start problem. If I could implement it and nothing else on my site, I would, but unfortunately the reality today is that most users don't know it.


I still have no idea what a "Remember me" checkbox is when I encounter one. It certainly doesn't seem to be a "keep me logged in" function. I don't know if it has something to do with form autofill, because my browser seems to do that whether it is checked or not.

Can anyone demystify this for me?


"Remember me" checkboxes and form auto-fill are unrelated.

The form auto-fill behavior is part of the browser UI, and can be configured in its option menu.

The "remember me" checkbox sets an identifying cookie with a late expiry date. Until that date, and unless you log out, the web site will recognize you (the server keeps a registry of which ID number corresponds to which user). No need to authenticate on connection because the cookie is sent with each request.

Without the remember me option, the expiry time is short, say 30-60 minutes, but it may be renewed as long as the user is active. If you're inactive for a longer period, the cookie will be discarded, and the site will not recognize you anymore.

When you log out, the session reference is deleted on the server, and, optionally, the cookie is cleared in the browser.


> The form auto-fill behavior is part of the browser UI, and can be configured in its option menu.

This can be dictated by the website as to whether or not this is allowable.


Great info! I'll try examining the cookie next time I encounter one.


Shouldn't this be called the definitive guide to session based authentication?


> if an attacker got his hands on your database, he could use the [persistent login cookie] tokens to log in to any account

If an attacker gets his hands on your database, it's kind of game-over already.


I don't mean to be rude, but you clearly don't understand this subject. Databases leak, not least of all due to human errors. Half the effort in computer security goes to preventing the leaks, and the other half goes to mitigating the consequences of such leaks. Hashing the passwords, salting the hashes, the entire md5/sha/pbkdf/bcrypt/scrypt debacle, all of these things are there only to mitigate the consequences of a database leak that is presumed to happen at some time in the future.


No, not at all. There are many scenarios in which data can be accessed read-only such as ACL misconfiguration, poorly secured backups, 0-day attacks which allow stealing cryptographic keys, overly verbose exception messages, etc.

An adversary who makes a single copy of your database could impersonate any user, and go unnoticed for potentially a huge period of time unless you have good intrusion protection. A targeted attack might steal just a single token, and could last a few seconds only, but then have unauthorised access indefinitely via the token.

EDIT: Incidentally this is why only the hash of the token should be stored in the database, just like storing passwords. Also the token should expire.
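
A minimal sketch of that, assuming an in-memory dict in place of the real database table. The names are hypothetical. Plain SHA-256 is used here because the token is a long random value rather than a human-chosen password.

```python
import hashlib
import secrets
import time
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for a database table: token_hash -> (user_id, expires_at)
TOKENS: Dict[str, Tuple[str, float]] = {}


def _hash(token: str) -> str:
    """SHA-256 of the token; only this digest is ever stored."""
    return hashlib.sha256(token.encode()).hexdigest()


def issue_token(user_id: str, lifetime_s: float, now: Optional[float] = None) -> str:
    """Create a random token, store its hash with an expiry, return the raw token."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    TOKENS[_hash(token)] = (user_id, now + lifetime_s)
    return token  # goes into the cookie; the server keeps no copy of it


def redeem_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id for a valid, unexpired token, else None."""
    now = time.time() if now is None else now
    record = TOKENS.get(_hash(token))
    if record is None:
        return None
    user_id, expires_at = record
    if expires_at < now:
        del TOKENS[_hash(token)]  # drop expired entries as they are seen
        return None
    return user_id


def revoke_all(user_id: str) -> None:
    """Invalidate every token for a user, e.g. after a password change."""
    for token_hash in [h for h, (u, _) in TOKENS.items() if u == user_id]:
        del TOKENS[token_hash]
```

Someone who copies `TOKENS` gets only digests, which cannot be replayed as cookies without inverting the hash.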


I love the attention to usability in the first answer.


I am also amazed at the way StackOverflow manages such a huge knowledge base. Information is so nicely organised and unwanted content automatically gets trimmed out in the end.

It is such a beautiful product. I like it particularly for the way they broke the rules of conventional forums (Yahoo forums, Google Groups) for technical discussions.



It's full of good info, but most of the time now, I'd just put Persona in and be done with it


Fantastic resource. Just what I was looking for.


Why do maximum security sites always disable auto-complete for username and password?

That seems less secure to me. If I always have to type in my password, chances are that I'll choose a password that can be easily remembered or I'll be forced to write it down somewhere.

(Personally, I use plugins to get around this anyway. My computer, my rules.)


Probably to prevent people accidentally saving a login on a shared/public computer.


The curious thing about this "solution" is that it's pretty fundamentally broken. If you're authenticating to any important site by typing a username/password into a computer you don't "trust", you're doing it wrong.

The subset of computers in between "people I don't trust also use this computer" and "this computer could easily have had a key logger or root kit installed" must be vanishingly small.

If you don't own it (or trust the person who owns it enough to satisfy your personal security requirements), then any username/password you type into it should be considered "possibly compromised" no matter what measures the website has taken to protect you. Two factor auth helps, but still has the problem that 2/3rds of your auth credentials could be compromised (the attacker could end up knowing your gmail username & password, leaving only the six digit auth-code to brute force, which I _hope_ Google has sensible protection in place for). Single use passwords also help, but both tfa and single use passwords don't protect against an attacker who 0wns the machine seeing and recording everything that happens in your current session - including I suspect for a sufficiently skilled attacker (or perhaps even a script kiddie with an off the shelf tool), complete access to the post SSL decrypted data inside a trojaned browser (if I can modify the browser, none of the httponly or secureonly flags for your session cookies are safe, sure, JavaScript can't extract them, but the browser code can… And it could be exporting them in real time to the bad guy, or piggybacking proxied instructions to empty your bank account via Western Union while you check your credit card balance)


I don't disagree, but if you run a site of any size you will quickly realize that users will do all sorts of crazy things against the best practices for security. One of the top (if not the top) search requests Google gets is "facebook login". What do you want to bet that a lot of those requests are coming from a shared computer?


Agreed that a lack of autocomplete is annoying. A savvy user can also circumvent this by using a browser plugin such as GreaseMonkey to force-fill whatever fields they want. I guess the assumption is that such a user will also be savvy enough to safe-guard their password.


And a savvy attacker can use a very similar GreaseMonkey plugin to record/export everything that gets submitted via those fields too…


Some stupid PCI audits fail you for having autocomplete enabled. I know it's completely absurd, but try convincing an auditor that's the case.


> Why do maximum security sites always disable auto-complete for username and password?

Because they think they are the centre of the universe. Theirs is the only site that matters, and is the only one they work on, so why would anything else matter?

Thankfully a Chrome plugin turns autocomplete back on for me. But there are still some sites that go out of their way to ensure that they still won't work. For example my doctor's site is some sort of third party abomination that would have looked "cool" ten years ago and requires me to open my password safe every time. Even Google pulls some stunts on some of their authentication pages preventing autofill.



