Opened 6 years ago

Last modified 5 weeks ago

#180 assigned Bug / Defect

OpenVPN won't send AUTH_FAILED if client-connect plugin exited successfully but script not

Reported by: ValdikSS Owned by: dazo
Priority: major Milestone: release 2.3.15
Component: Generic / unclassified Version: OpenVPN 2.2.1 (Community Ed)
Severity: Not set (select this one, unless your'e a OpenVPN developer) Keywords: radius scripts
Cc:

Description

I have radiusplugin and client-connect script. Radiusplugin is used only for accounting and some kind of authorization is managed by client-connect script. If you disable radiusplugin and client-connect script will exit with error code 1, client would be disconnected with AUTH_FAILED, but if radiusplugin client-connect function returns 0 and script returns 1, client would constantly send PUSH_REQUESTs and nothing more.
It can be fixed running client-connect script before plugin's client-connect. Patch included.

Attachments (1)

openvpn-auth_failed.diff (623 bytes) - added by ValdikSS 2 years ago.

Download all attachments as: .zip

Change History (18)

comment:1 Changed 5 years ago by samuli

  • Keywords radius scripts added

I have difficulty understanding what you mean exactly... Can you share your client-connect script? Also, which radiousplugin are you using?

comment:2 Changed 4 years ago by ValdikSS

Sorry for such a slow reply.
I'm using http://www.nongnu.org/radiusplugin/ this radiusplugin and user authorization by client-connect script. So, when the client is connecting, radiusplugin connects to radius server. If it fails to connect, client gets disconnected with AUTH_FAILED. But if radiusplugin successfully connected with radius server, then client-connect script checks user authorization. If it fails (returns 1), then client got disconnected on the server side, but no disconnection message is pushed to the client, that's why client just waits for the packet for ~30 sec and tried to reconnect in a loop.
If you disable radiusplugin, client-connect script works just as it should.

comment:3 Changed 4 years ago by cron2

  • Owner set to dazo
  • Status changed from new to assigned

David, can you look into this?

comment:4 Changed 3 years ago by syzzer

This seems to be a duplicate of #179. This ticket contains more details, so closing #179 as duplicate. #179 contained the following comment from cron2:

Quoting cron2:

Wouldn't "just swap the order the things are run" break it for someone else? What if you change the order, client-connect script succeeds but plugin connect fails?

No, a proper patch would need to take the return code of both into account, requiring both to succeed (if both are enabled).

(Uh, your text says "patch included", but there is none attached...)

comment:5 Changed 3 years ago by cron2

Maybe also the same issue: #521

comment:6 Changed 3 years ago by samuli

  • Milestone set to release 2.5

comment:7 Changed 3 years ago by cron2

  • Milestone changed from release 2.5 to release 2.3.8

comment:8 Changed 2 years ago by cron2

  • Milestone changed from release 2.3.8 to release 2.3.9

dazo... if you could have a look...?

Changed 2 years ago by ValdikSS

comment:9 Changed 2 years ago by ValdikSS

So we have a successful authentication counter in multi.c. First goes plugin authentication and then client-connect script authentication. If this counter (cc_succeeded_count) is more than 1 but last authentication failed (cc_succeeded is set to false), then mi->context.c2.context_auth is set to CAS_PARTIAL instead of CAS_FAILED which is not handled by push.c and client never gets AUTH_FAILED.

I don't know what CAS_PARTIAL is there for. We can drop CAS_PARTIAL and always use CAS_FAILED there or add CAS_PARTIAL handler for AUTH_FAILED in push.c. The patch for latter is in attachment.

comment:10 Changed 2 years ago by dazo

CAS_PARTIAL means that at least one of the additional "authentication" layers succeeded. To my understanding, this isn't the username/password auth or TLS certificate authentication, but --client-connect or the similar hooks in a plug-in. IIRC, the management interface can also be used at this stage to control client connections. CAS_FAILED will also happen if you have a CCD config which contains the 'disabled' option.

So CAS_PARTIAL means that we're not quite happy with all the checks related to establishing a client connection. And it will only happen if we do not have an explicit cc_succeeded == true situation (that is, no checks have failed). CAS_PARTIAL will be set if cc_succeeded == false and cc_succeed_count > 0, otherwise it is set to CAS_FAILED.

But there are many interesting code paths leading to these three scenarios. For example plugins with with both CLIENT_CONNECT and CLIENT_CONNECT_V2 will both be run, but if either of them fail a --client-connect script will not be run. The 'disable' option in a CCD config will regardless overrule any results from previous checks.

I am not yet convinced that CAS_PARTIAL should result in AUTH_FAIL. I need to be 100% sure there are no situations where you want the client to do a reconnection, as in this scenario CAS_PARTIAL is useful. Making CAS_PARTIAL behave like CAS_FAILED will change the behaviour from what we have today, that may break some setups.

Also consider that --client-connect (or the related --plugin hook) isn't really aimed for authentication, but for creating a dynamic client configuration. Which means returning AUTH_FAILED is wrong when --client-connect fails. The authentication should happen in --tls-verify or --auth-user-pass-verify (or related --plugin hooks). What complicates this separation is that you can disable a client through the 'disabled' option in a CCD, where it probably should return AUTH_FAILED.

So my understanding is that if you want a client to "shut up and close the connection" when --client-connect does not succeed, it should on the server side propagate that failed result back to either a the --tls-verify or --auth-user-pass-verify authentication layers. That will happen if CAS_PARTIAL is the result, the client will then reconnect and the TLS layer (which takes care of --tls-verify and --auth-user-pass-verify) should then ensure to return the failure from the last attempt - which in the end results in the client receiving AUTH_FAILED.

comment:11 Changed 2 years ago by ecrist

I'm fairly certain this is a different bug that what I reported in #521, but it does appear to be along the same lines of logic. The end result seems to be that the client authentication or authorization has failed and the client attempting to connect is never actually made aware of the situation.

I think, also, for CAS_PARTIAL to really be useful, some message would need to be passed to the client to know what that actually means. I'm not really certain how we'd go about that, however.

comment:12 Changed 2 years ago by cron2

Under the hood, it's the same cause that breaks #521 - "if some bits succeeded and others fail, we end up at the end of the function with cc_succeeded=false and cc_succeeded_count>0, which leads to CAS_PARTIAL" - and all the rest of the code has no idea what to do with it.

Since the code predates my involvement by lenght, I've asked James for advice - and tend to kick out everything related to CAS_PARTIAL :-)

Dazo: --client-connect failure is documented (somewhere) to lead to client login failure, and that can indeed be useful - like, failure to access some sort of database that you *need* to handle this, etc. - so if we want to keep CAS_PARTIAL for this case, we should clearly define *when* this state should happen, and what should be *done* about it (like: send control-channel RESTART?) - but the way it is now with "we set CAS_PARTIAL but all the rest of the code will then just ignore this client, and not tell it that it's dead either" is non-useful - and I wouldn't really concern courselves about compatibility here, as this is not serving any useful purpose, having dead clients hang around until they time out. But let's see what James says.

comment:13 Changed 2 years ago by cron2

The more I think about it, the more I come to the conclusion that the current implementation of CAS_PARTIAL is at least violating POLA, and at best not serving any useful purpose - so it's complexity without a clear use case, and should just go...

(Since we short-circuit calling authentication modules, and stop when the first module is unhappy, the result of "one fails, one succeeds" can be CAS_PARTIAL or CAS_FAIL, depending on calling order - is that what we want? why?)

comment:14 Changed 2 years ago by cron2

  • Milestone changed from release 2.3.9 to release 2.3.11

David and James discussed this and indeed we could do something useful with CAS_PARTIAL (return not AUTH FAILED but HALT, which tells the client "auth succeeded but something else refused access"). I'm not sure we'll see a patch for this in the next few days, so bumping the milestone to 2.3.11

comment:15 Changed 20 months ago by samuli

  • Milestone changed from release 2.3.11 to release 2.3.12

comment:16 Changed 14 months ago by cron2

  • Milestone changed from release 2.3.12 to release 2.3.15

David...? I assume this won't happen any time soon?

comment:17 Changed 5 weeks ago by cbabs

Is there any word on this? All I have to do is have 'client-connect ' and cmd on any kind and it causes it to fail. I need to add host routes to the kernel table to redis into BGP and this is killing us. Thanks!

Note: See TracTickets for help on using tickets.