Saturday, November 25, 2023

Podman pitfalls

Fedora docs are your friend

https://docs.fedoraproject.org/en-US/fedora-coreos/

 

SELinux might be on

If you are getting permission denied errors, watch out for SELinux. SSH into your podman VM and check /etc/selinux/config. You can consider switching to permissive mode and rebooting.
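
A minimal sketch of how to check this, assuming the default podman machine (adjust to your machine name):

podman machine ssh          # enter the podman VM; the commands below run inside it
getenforce                  # prints Enforcing, Permissive or Disabled
sudo vi /etc/selinux/config # set SELINUX=permissive if you accept the trade-off
sudo systemctl reboot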

Certificate errors

MITM

Some companies like to, or must, inspect their users' traffic. Generally this is done by having a transparent proxy which terminates SSL/TLS and uses a self-signed certificate that is owned by the company and can be considered trusted. The default podman VM won't trust this certificate. You can try the following:

Copy the PEM file to /etc/pki/ca-trust/source/anchors/ and then update the trust:
update-ca-trust force-enable && update-ca-trust extract
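
One way to get the certificate into the VM from the host, assuming podman machine ssh forwards stdin and with corp-ca.pem as a placeholder for your company CA file:

podman machine ssh "sudo tee /etc/pki/ca-trust/source/anchors/corp-ca.pem > /dev/null" < corp-ca.pem
podman machine ssh "sudo update-ca-trust extract"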

Time drift

If the podman VM has time drift, this can also break SSL/TLS certificate verification. Just update the time of your VM.
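
A couple of ways that may fix the drift, assuming the Fedora CoreOS based VM runs chronyd:

podman machine ssh "sudo systemctl restart chronyd"
# or simply restart the VM
podman machine stop && podman machine start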

Allow docker in podman

sudo rpm-ostree install podman-docker
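
Note that rpm-ostree layers the package into a new deployment, so the VM typically needs a reboot before the change is live (newer rpm-ostree versions also offer --apply-live). A quick check from inside the VM afterwards:

sudo systemctl reboot
docker --version    # the docker CLI should now work, answered by podman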

Kubernetes cheat sheet

Just some commands that had value at some time or another.

Resources

All resources in a namespace

Just iterate over the resource types and look for them:
 
for i in `kubectl api-resources --verbs list --namespaced -o name`; do kubectl get --show-kind --ignore-not-found $i; done

Which pods still have a persistent volume claim

kubectl get pods --all-namespaces -o=json | jq -c '.items[] | {name: .metadata.name, namespace: .metadata.namespace, claimName: .spec.volumes[] | select(has("persistentVolumeClaim")).persistentVolumeClaim.claimName}'

Networking

Jump portals

For this to work you need to be able to exec into a pod and socat has to be available on that pod. When that is the case you can tunnel via the pod towards a target.

On the pod, set up a listener that forwards to the remote endpoint, then port-forward to that listener from your machine:

socat tcp-l:<pod-port>,fork,reuseaddr tcp:<target-host>:<target-port>
kubectl port-forward pod/<jump-pod> <local-port>:<pod-port>
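
A hypothetical worked example (the pod name, host and ports are placeholders): reaching a database at db.internal:5432 through a pod named jump:

# in one terminal: start the listener on the pod
kubectl exec jump -- socat tcp-l:15432,fork,reuseaddr tcp:db.internal:5432
# in a second terminal: forward a local port to the pod listener
kubectl port-forward pod/jump 5432:15432
# a local client can now connect to localhost:5432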

resources:
- socat command list: https://exploit-notes.hdks.org/exploit/network/port-forwarding/port-forwarding-with-socat/
- k8s port-forward docs: https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/


Thursday, September 1, 2022

OAuth 2.0 notes

This post is by no means meant to be original; it is just some notes to persist info acquired while digesting OAuth 2.0 / OpenID Connect articles. Use at your own risk. An attempt was made at keeping pointers to the sources.


rfc6749



The authorization code grant type is used to obtain both access
   tokens and refresh tokens and is optimized for confidential clients.
   Since this is a redirection-based flow, the client must be capable of
   interacting with the resource owner's user-agent (typically a web
   browser) and capable of receiving incoming requests (via redirection)
   from the authorization server.

     +----------+
     | Resource |
     |   Owner  |
     |          |
     +----------+
          ^
          |
         (B)
     +----|-----+          Client Identifier      +---------------+
     |         -+----(A)-- & Redirection URI ---->|               |
     |  User-   |                                 | Authorization |
     |  Agent  -+----(B)-- User authenticates --->|     Server    |
     |          |                                 |               |
     |         -+----(C)-- Authorization Code ---<|               |
     +-|----|---+                                 +---------------+
       |    |                                         ^      v
      (A)  (C)                                        |      |
       |    |                                         |      |
       ^    v                                         |      |
     +---------+                                      |      |
     |         |>---(D)-- Authorization Code ---------'      |
     |  Client |          & Redirection URI                  |
     |         |                                             |
     |         |<---(E)----- Access Token -------------------'
     +---------+       (w/ Optional Refresh Token) 
https://www.rfc-editor.org/rfc/rfc6749#section-4.1

AWS Cognito's authorization code grant:

https://aws.amazon.com/blogs/mobile/understanding-amazon-cognito-user-pool-oauth-2-0-grants/
Cognito comes by default with a hosted auth UI on a URI with a chosen domain name:
https://<domain-name>.auth.<region>.amazoncognito.com
In there you have the different endpoints for your authn/authz flows, which are documented at
https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-userpools-server-contract-reference.html
 
 

Verification of JWT tokens from Cognito

The key information for verification depends on the user pool and can be retrieved from:
https://cognito-idp.Region.amazonaws.com/your_user_pool_ID/.well-known/jwks.json . For details
see the knowledge-center article https://aws.amazon.com/premiumsupport/knowledge-center/decode-verify-cognito-json-token/ 
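
A quick way to fetch the key material, with the region and user pool id as placeholders:

curl -s https://cognito-idp.<region>.amazonaws.com/<user-pool-id>/.well-known/jwks.json | jq .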
 

ALB authn/authz

https://docs.aws.amazon.com/elasticloadbalancing/latest/application/listener-authenticate-users.html
The load balancer sets the following headers:
  • x-amzn-oidc-accesstoken

The access token from the token endpoint, in plain text.

  • x-amzn-oidc-identity

The subject field (sub) from the user info endpoint, in plain text.

  • x-amzn-oidc-data

The user claims, in JSON web tokens (JWT) format.
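
Since x-amzn-oidc-data is a JWT, here is a rough sketch for peeking at the forwarded claims on the backend (decoding only, no signature verification; the variable is a placeholder for the raw header value and the payload segment may need base64 padding fix-ups):

echo "$OIDC_DATA" | cut -d '.' -f2 | base64 -d 2>/dev/null | jq .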

 

Miscellaneous notes

    • redirect_uri values have to match
    • state is also used to avoid cross-site request forgery attacks
 

Sunday, January 23, 2022

Authentication via Belgian eID card with Mozilla Firefox on Linux

Since 2003 the Belgian government has issued electronic identity cards called "eID" cards. These cards have cryptographic keys on them that allow digital signatures. This enables you to authenticate to government e-services using a smart card reader, your eID and your PIN (which you have to set when you receive your eID card). While setting up the software on Windows is straightforward, I found the process for Linux less trivial, hence I'd like to document it.

So if you want to use your eID card to log in to Belgian governmental services on Linux you can use this as a guide. My setup is Linux Mint 20.3 "Una", which was not considered a supported OS at the time of writing (20.2 "Uma" was), so these instructions should not be limited to supported OSes.

The well documented part

In order to be able to authenticate using an eID you need some middleware on your computer. This middleware provides the interface between your browser plugin and your eID in the smartcard reader. This section describes the installation of this middleware.

 

This piece of software is available via https://eid.belgium.be and at the time of writing the Linux version can be found at https://eid.belgium.be/en/linux-eid-software-installatie . I do appreciate that they have tried to support a variety of popular Linux distributions, but what I respect most is their decision to make it available as open source. In my case I used a newer release of Mint and therefore I downloaded the tarball from the link under "Downloads for unsupported distributions". Surprisingly they don't mention their GitHub repo, which has a better README.md. So I'd advise starting there: https://github.com/Fedict/eid-mw, or just check out the instructions there and use them to install the code packaged in the tarball.

The GitHub page details the prerequisites. For my distribution I was able to install them using:

 sudo apt-get install libtool autoconf automake libassuan-dev autoconf-archive libpcsclite-dev libgtk-3-dev libcurl4-openssl-dev libproxy-dev libxml2 libp11-kit-dev openssl
 

Once you have the prerequisites installed you can just use the commands from the GitHub README.md to install the middleware:

 autoreconf -i
 ./configure  
 make 
 sudo make install

After this the middleware is installed. It comes with a binary application which you can run via the command `eid-viewer` and use to verify it is working correctly.

No card reader found

At this stage the application opened, but the status bar at the bottom stated "No cardreader found" even when my card reader was connected. Generally, when a card reader is connected but no eID is inserted, it should read "Ready to read identity card", but when testing just make sure to insert an eID card because the card reader might require this. In my case however I was stuck at "No cardreader found".

The most likely explanation was that Linux didn't know how to communicate with my card reader and that it requires a device driver. Since the card reader is many years old I no longer had the documentation that came with it, and the device itself only has a label "Digipass by Vasco", which is too little to find a specific driver.

Fortunately Linux can help here. Since my smart card reader is a USB device I can list all the USB devices and get their VendorId and ProductId using lsusb:

 lsusb
 ...
 Bus 003 Device 006: ID 1a44:0001 VASCO Data Security International Digipass
 ...
 

When you Google this you arrive at the useful linux-hardware.org website (specifically https://linux-hardware.org/?id=usb:1a44-0001). There they link to https://salsa.debian.org/rousseau/CCID which is a driver that works with quite a lot of models. This time I found the instructions on the project website more useful (i.e. https://ccid.apdu.fr/ ). It also seems that you can do a `sudo apt-get install libccid`, but I only saw this package afterwards. In my case I compiled from source.

If at this stage you still aren't able to check your card via `eid-viewer`, then make sure to install `pcscd`, which is a daemon that allows access to a smart card. You'll need it anyway at a later point. After installing you can also start the corresponding service:

 sudo apt-get install pcscd
 sudo service pcscd start
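
As an extra sanity check you can use pcsc_scan, assuming the pcsc-tools package is available for your distribution; it should list your reader and, with an eID inserted, print card details:

 sudo apt-get install pcsc-tools
 pcsc_scan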

Hopefully at this stage you can open `eid-viewer` and see your card details.

Now make it work on firefox

There is an official extension to make the eID work with Firefox, which is available at https://addons.mozilla.org/nl/firefox/addon/belgium-eid/ .

However for me it always would notify me at startup with "A recent enough version of the Belgian electronic identity card software could not be found. Is the eID software installed and up-to-date?".

I know the software is up-to-date since I have installed the latest version. This part required a bit more research, so I'll split it up into 2 sections, the explanation and the solution, so you can skip to the solution if you are only interested in getting yours working. If the solution doesn't work for you, the explanation can give insight into how it works to aid troubleshooting on your end.

The explanation

Whenever you get stuck you need to troubleshoot. One tool for troubleshooting is debugging. Recent versions of Firefox allow you to debug extensions out of the box. If you browse to "about:debugging" you get a debugging screen for Firefox; when you select "This Firefox" you can see all the extensions that are currently installed. When you click the "Inspect" button for the "eID Belgium" plugin you'll get a screen (if you've used the developer tools from Firefox or Chrome this looks very familiar). There is a debugger tab in which you can find the main thread, which runs "background-script.js". This allows you to see what the plugin actually does. In this particular case it tries to install a pkcs11 security module. When that fails it shows the notification we saw earlier.

Since automatic installation doesn't work I thought let's do it manually, and I followed the instructions from https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/PKCS11/Module_Installation . After I added the module I could see under security devices that Firefox can even see the attached card reader. Full of optimism I tried a test login via https://iamapps.belgium.be/tma/?lang=en leading to a disappointing failed login.

The Firefox window to manage security devices gives very few configuration options, but the issue at hand here is that the Firefox extension "eID Belgium" is not allowed to use the pkcs11 module that I imported manually. If you've tried creating this security module manually you'd want to delete it again, because although it shows the card reader it won't be usable, and it will give rise to a name conflict as you can only have one module named "beidpkcs11".

The proper way to go about it is via a native manifest: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Native_manifests#manifest_location . This manifest was actually built, but it was put at /usr/local/lib/mozilla/pkcs11-modules/beidpkcs11.json, which does not seem to be checked by Firefox (perhaps in the past it was). There is also an alternate version that acts as a fallback: /usr/local/lib/mozilla/pkcs11-modules/beidpkcs11_alt.json . This fallback uses p11-kit, but in my case the first version works perfectly.

The solution

Put the generated manifest on the location where Firefox expects it:

 sudo mkdir -p /usr/lib/mozilla/pkcs11-modules/
 sudo ln -s /usr/local/lib/mozilla/pkcs11-modules/beidpkcs11.json /usr/lib/mozilla/pkcs11-modules/beidpkcs11.json

Restart Firefox and you should no longer see the notification pop-up, and ideally the lights of your card reader are even blinking.

Note: according to the docs the following 3 locations should be valid:

  1. /usr/lib/mozilla/pkcs11-modules/beidpkcs11.json
  2. /usr/lib64/mozilla/pkcs11-modules/beidpkcs11.json
  3.  ~/.mozilla/pkcs11-modules/beidpkcs11.json 

For me option 2 didn't work even though I'm on Firefox 96.0.1 (64-bit). Options 1 & 3 both worked. So use option 1 if you want it to be system-wide, or option 3 if you only want it to be available for a specific user.

The final test

Now go to https://iamapps.belgium.be/tma/?lang=en and you should be able to authenticate successfully!


Finally you can start the fun part: the filing of your taxes! Or whatever you wanted to do.


Edit: I wanted to create a PR to add the native manifest but it seems somebody has beaten me to it (with this change). So likely the next version will be easier to install.

Friday, December 3, 2021

Serialization Anomaly when using MVCC (The good, the bad & the ugly)

Intro

When working in depth with databases, sooner or later you will encounter transaction isolation levels. If you are new to transaction isolation check out the Postgres transaction isolation documentation, because that documentation team did an amazing job! In this post I'd like to go into more depth on one issue mentioned on that page: the serialization anomaly.

Real life example

A picture says more than a thousand words, so let's try to create a mental picture by talking about a real-life scenario. For this, assume we are creating an application that has to manage financial accounts.

Let's take as a starting point 3 accounts with the following balances:

  • Account A: €20
  • Account B: €30
  • Account C: €40

Now consider 2 transactions:

  • tx1: transfers money (€10) from account A to account B
  • tx2: transfers money (€10) from account A to account C

In order to perform these transactions, the database engine is required to read the balance of the sender from the balances table, subtract the amount sent and write the new balance. Subsequently it needs to read the balance of the receiver, add the amount and write the receiver's new balance.

When these transactions happen in a non-overlapping mode, performing these changes to your data is almost trivial. We can detail the steps as follows, with transaction 1's statements marked with subscript 1 and transaction 2's with subscript 2:

begin₁ - RA₁ - WA₁ - RB₁ - WB₁ - end₁ - begin₂ - RA₂ - WA₂ - RC₂ - WC₂ - end₂
 

So in detail:

  • begin: begins the first transaction
  • RA: reads balance for account A 20 euro
  • WA: write the new balance of 10 Euro for account A
  • RB: reads balance for account B 30 euro
  • WB: write new balance of 40 euro for account B
  • end: ends the first transaction
  • begin: begins the second transaction
  • RA: reads balance for account A 10 euro (since previous change is committed)
  • WA: writes the new balance of 0 euro for account A
  • RC: reads balance for account C 40 euro
  • WC: writes the new balance of 50 euro for account C
  • end: ends the second transaction

These steps should be intuitive and feel familiar as it is the behavior we expect when performing financial transactions in real life.

The reason why this is easy is that concurrency in this example is always 1: a single transaction is in the system at any time, so no overlap is taking place.


The interesting world of concurrency

Just like a puzzle becomes more interesting with more than 1 piece, concurrency becomes interesting with multiple concurrent actors.

A modern database however will have a few tricks up its sleeves to manage concurrency. Locking is probably the best-known mechanism to deal with concurrency. It can for example avoid data corruption due to multiple processes writing the same record at the same time. Locking is very powerful but is generally not the first choice to deal with concurrency because it causes contention which lowers the throughput of your database system. Which brings us to another database trick; Multiversion Concurrency Control, MVCC in short.

MVCC is an implementation where each transaction gets a snapshot of the database state at transaction start. This allows the DB engine to read data without risking a dirty read: it avoids reading uncommitted data even when that data is changed by another concurrent process. It also enables repeatable reads. It can however cause serialization anomalies, which make scenarios possible that don't make sense in real life. In order to understand why this is a problem, let's revisit our 2 example transactions. Let's start with our system initialized again to the starting point detailed earlier, but this time let's have the transactions executed concurrently such that the different statements within them are interleaved.

Assume the order:

begin₁ - RA₁ - begin₂ - RA₂ - WA₁ - RB₁ - WB₁ - end₁ - WA₂ - RC₂ - WC₂ - end₂

So in detail:

  • begin: begins the first transaction
  • RA: reads balance for account A 20 euro
  • begin: begins the second transaction
  • RA: reads balance for account A 20 euro (the previous change is not committed and the snapshot taken when this transaction started still had the original balance of 20 euro for account A)
  • WA: write the new balance of 10 Euro for account A
  • RB: reads balance for account B 30 euro
  • WB: write new balance of 40 euro for account B
  • end: ends the first transaction
  • WA: writes the new balance of 10 euro for account A (since this transaction also read the 20 euro starting balance due to MVCC and subtracted 10 from it)
  • RC: reads balance for account C 40 euro
  • WC: writes the new balance of 50 euro for account C
  • end: ends the second transaction


This is a serialization anomaly, and this example hopefully illustrated why this is a problem. If it is unclear, imagine you are a bank managing these accounts. Your system has just generated money for your customer, who could withdraw or use that money! The above scenario is quite possible if you picture an account with multiple users. Two different users could easily wire money to different accounts concurrently around the same time. So for these types of workloads it is important to avoid serialization anomalies.

Serializable isolation to the Rescue

It is possible to avoid serialization anomalies in database engines that allow running with the strictest transaction isolation level: serializable isolation.

Serializable isolation states that, in order for a concurrent execution of transactions to be valid, a serial ordering of those transactions has to exist such that each statement execution has the same result as in the concurrent execution.

So let's start from the concurrent execution in our example. Since there are 2 transactions we have 2 possible serial orderings:

  • Tx1 followed by Tx2
  • Tx2 followed by Tx1

Or in our per statement illustration:

  • begin₁ - RA₁ - WA₁ - RB₁ - WB₁ - end₁ - begin₂ - RA₂ - WA₂ - RC₂ - WC₂ - end₂
  • begin₂ - RA₂ - WA₂ - RC₂ - WC₂ - end₂ - begin₁ - RA₁ - WA₁ - RB₁ - WB₁ - end₁

I leave it to the reader to detail the outcomes of the second ordering. You should arrive at the conclusion that account A will always have a balance of 0 after both transactions finish, no matter how you order these 2 transactions (serially). Therefore, if you have a database engine running at transaction isolation level serializable and you try to do the concurrent execution, that engine will give an error. This does not mean there is an issue in the database engine but rather that there is a problem with your workload. If it occurs rarely it's possible to just have your application catch these exceptions and retry, because when this error is thrown the database engine will generally roll back one of the transactions, often the transaction that submitted the statement that would give rise to the anomaly.
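
A rough sketch of reproducing this with PostgreSQL; the table name and psql commands are my own illustration of the example above, run from two separate psql sessions:

psql -c "CREATE TABLE balances (account text PRIMARY KEY, amount numeric);"
psql -c "INSERT INTO balances VALUES ('A', 20), ('B', 30), ('C', 40);"
# session 1 (tx1):
#   BEGIN ISOLATION LEVEL SERIALIZABLE;
#   SELECT amount FROM balances WHERE account = 'A';     -- 20
# session 2 (tx2):
#   BEGIN ISOLATION LEVEL SERIALIZABLE;
#   SELECT amount FROM balances WHERE account = 'A';     -- still 20
# session 1:
#   UPDATE balances SET amount = 10 WHERE account = 'A';
#   UPDATE balances SET amount = 40 WHERE account = 'B';
#   COMMIT;
# session 2:
#   UPDATE balances SET amount = 10 WHERE account = 'A';
#   -- ERROR: could not serialize access ... (tx2 is rolled back and has to be retried)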

How can the DB know?

There is no such thing as magic (not even in the database world), so the engine must have some clever way of detecting that a statement would cause a serialization anomaly. Working backwards from the behavior of serially ordered transactions you could come up with 2 rules:

  1. If data read in a transaction TX was written by another transaction TC which was committed before the start of TX, then in a serial ordering TX must follow TC (order: TC -> TX)
  2. If data read in a transaction TY is written by another transaction TU that overlaps with TY but wasn't committed when TY started, then TY must precede TU (order: TY -> TU)
 Note that
  • Data read and written within the same transaction doesn't impose ordering
  • 2 Reads from different transactions also don't impose an ordering
  • Since more than 2 transactions can overlap, a read-only transaction could still give rise to a serialization anomaly!

In a concurrent execution that is a serialization anomaly you will get ordering rules that conflict. For our example:

begin₁ - RA₁ - begin₂ - RA₂ - WA₁ - RB₁ - WB₁ - end₁ - WA₂ - RC₂ - WC₂ - end₂

  • begin: begins the first transaction
  • RA: reads balance for account A
  • begin: begins the second transaction
  • RA: reads balance for account A 
    • 2 reads don't impose ordering between each other
  • WA: write the new balance of account A
    • At this point the DB knows that T2 reads data written to by an overlapping transaction T1 which is uncommitted so T2 must precede T1 (T2 -> T1 )
  • RB: reads balance for account B
  • WB: write new balance for account B
  • end: ends the first transaction
  • WA: writes the new balance for account A 
    • Now the DB knows that T1 read data written by T2 which was not committed at the start time of T1, so T1 has to precede T2 (T1 -> T2)

At this stage the DB engine can throw an exception, as no matter what happens further with T2 (except for a transaction rollback) it will yield a serialization anomaly. Therefore it is best to roll back the transaction immediately and not allow further statements, as their results shouldn't be relied upon anyway.

The advantage of the arrow notation is that you can chain the orderings together, and as soon as you encounter a transaction for a second time you know there is a violation. So if we chain them in the order we found them:

(T2 -> T1 ) || (T1 -> T2) => (T2 -> T1  -> T2)

A directed graph is a useful way of tracking the orderings discovered by applying these rules. A cycle in that graph indicates a serialization anomaly, so the engine has to keep the graph acyclic (a DAG).

In summary

Transactions should see the database as if they were running alone on the system, not being impacted by other running transactions. This is important because we don't know what will happen to those other transactions; they could be aborted, and in that case relying on the data written by them would cause an anomaly in itself. Ironically, it is the measures we put in place to protect against these anomalies that can give rise to a serialization anomaly, where results of statements from overlapping transactions wouldn't be obtainable if the transactions had happened serially. This post showcases that for MVCC and aims at providing intuition into why these serialization anomalies are problematic, and gives basic insight into what a database can do to protect you from them. That is, if you choose to run with a transaction isolation level of serializable.

 

Final Notes:

I haven't gone into much detail about locking. Locking can be used to avoid serialization anomalies, but it requires aggressive locks (e.g. exclusive locks) that would enforce serial access to the underlying resource. The example is chosen in such a way that the concurrent execution would be possible even with table-level locking, where during normal database execution writes would block other writes. This is because locks follow the lifetime of a transaction, and T1 ended before a write in T2 could give rise to contention on a lock.

I have tried to give a simple explanation of the tracking mechanism above. Note that my example covered only rule 2. Rule 2 is the easiest to track as you only need to take into account open transactions. Rule 1 on the other hand uses committed transactions, which is troublesome to track since their number grows with the uptime of your database. You can stop keeping track of committed transactions from the moment they no longer have (direct or indirect) overlap with open transactions. This overlap can disappear because transactions close (by commit or rollback), but a lot of database instances have a continuous workload which could block this type of cleanup, since there always remains overlap with open transactions. The good news is that research exists which investigates the use of heuristics to efficiently avoid serialization anomalies. A nice and freely available paper on such a heuristic is "Efficiently making (almost) any concurrency control mechanism serializable".

Sunday, December 23, 2018

Setting up Python 3.7 with SSL support

Intro

If you are lucky enough to have Python 3.7 in your OS repositories then you can skip this post; if not, you might have a hard time setting up Python 3.7 with SSL support. Python 3.7 requires a recent version of OpenSSL, but even when that one has been installed, in my case the build would not find it, so I'm documenting the steps of how I got it to work.

1) Get a recent openssl installation

In my case I went for 1.1.1a as that one should be working with Python 3.7.
cd /usr/src/
sudo wget https://github.com/openssl/openssl/archive/OpenSSL_1_1_1a.tar.gz
sudo tar -xvzf OpenSSL_1_1_1a.tar.gz
cd openssl-OpenSSL_1_1_1a/
export CFLAGS=-fPIC  # Make sure we build shared libraries
./config shared --prefix=/usr/local/openssl111a --openssldir=/usr/local/ssl  # Make sure we build shared libraries
sudo make
sudo make install
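
An optional check that the build landed where expected (path assumed from the prefix used above):

/usr/local/openssl111a/bin/openssl version  # should report OpenSSL 1.1.1a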

 

2) Get and install Python

It is possible to follow the instructions from https://tecadmin.net/install-python-3-7-on-ubuntu-linuxmint/ with the following changes:

  • Before doing any configure, make symbolic links to the openssl libraries. I tried adding the openssl lib folder to LD_LIBRARY_PATH but for some reason I would still always error out with

    *** WARNING: renaming "_ssl" since importing it failed: libssl.so.1.1: cannot open shared object file: No such file or directory'


    Creating symbolic links in a default library path did strangely enough work:
    cd /usr/lib
    sudo ln -s /opt/openssl/lib/libcrypto.so.1.1
    sudo ln -s /opt/openssl/lib/libssl.so.1.1
    
    
  • When doing the configure specify additional details

    sudo LDFLAGS="-L/opt/openssl/lib" ./configure --enable-optimizations --with-openssl=/opt/openssl > /tmp/configure_output


  • When doing the make, redirect stdout so you only see the warnings. If the following warnings don't show:

    • *** WARNING: renaming "_ssl" since importing it failed: libssl.so.1.1: cannot open shared object file: No such file or directory
    • *** WARNING: renaming "_hashlib" since importing it failed: libssl.so.1.1: cannot open shared object file: No such file or directory


      then you are good to go.


  • You can validate by importing the python ssl lib:
    $ /usr/local/bin/python3.7
    Python 3.7.0 (default, Dec 23 2018, 10:35:52) 
    [GCC 6.4.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import ssl
    >>>
    
  • No error means you should be good to go.

Tuesday, August 30, 2016

My attempt at the AWS Solution Architect professional exam sample questions.

For this practice exam the correct answers are given on the Japanese version of the practice exam, which can be found here. The English practice exam is available from here. In this post I'd like to provide my reasoning on why the given answers are the correct ones. I have done similarly for the DevOps professional exam here.

Question 1: Best RTO for on-premise Content Management System

 - Answer A is the best of the provided options because a Storage Gateway is already used and its volumes can be converted to EBS volumes. RMAN backups in S3 also allow restoration into EC2.
 - Answer B is not acceptable since Glacier storage has recovery times >= 3 hours.
 - Answer C: there is no need to attach an AWS Storage Gateway to the EC2 instance; better to use an EBS volume.
 - Answer D: AWS Storage Gateway-VTL is for tapes, so it is not needed here as you had a Storage Gateway volume.

Question 2: ERP application in multiple AZs

 - Answer C is valid and allows you to restore data up until 5 minutes from the issue (so the RPO of 15 minutes is met). Since you have hourly backups in S3 as well, you can quickly restore these and you only need to replay transaction logs for at most 1 hour. Furthermore S3 provides excellent data retention.
 - Answer A is not acceptable since Glacier recoveries take too much time (> 3 hours).
 - Answer B is not good for this scenario as it is unknown how the data corruption occurred. Probably the data corruption was introduced by a logical error rather than issues on the storage level. Since synchronous replication only makes sure you write the changes on a 2nd system as part of your transactions, it doesn't allow you to recover to an earlier time to protect against these corruption errors.
 - Answer D is unacceptable because even though instance store volumes might allow quicker backups, they are volatile and should not be relied upon for database backups (they are also only accessible from 1 instance and therefore the data is only in 1 AZ).

Question 3: Random acts of kindness

 - Answer B is good as it is a cheap way that allows you to operate without maintaining infrastructure.
 - Answer A is not good as IAM users should be internal users of your organization. You should not use these identities for 'web' users. One reason is that the number of users would be limited. Even if you would map them to a single 'application' user it would not be a good practice to do so.
 - Answer C is not good, again because of the IAM user usage as well as introducing additional unnecessary infrastructure (incurring costs).
 - Answer D introduces unneeded infrastructure, incurring unneeded costs.

Question 4: Protecting SSL

 - Answer D is the best. CloudHSM is hardened to make sure SSL certificates cannot leave the device. Furthermore its design and external certification certify that Amazon employees won't have access to them either. Since Amazon employees also don't have access inside your EC2 instance it is good to store your logs on an ephemeral volume using a randomly generated AES key. This means that you will lose your logs upon stop/start or when you experience a hardware failure, but there were no retention requirements mentioned for the log files. Since the volumes use this random key when mounting, you grant your users access by granting them access to the instance. The encryption makes sure that data is encrypted at rest and that physical access does not compromise your data.
 - Answer A is generally a good solution, but since in this case security is the main concern it is not the best solution. By offloading SSL at your load balancing tier you have the traffic flowing in plain text from the ELB to the web servers.
 - Answer B is not good as there is no way of protecting your private key in the Amazon S3 bucket. Since your instances need access to the S3 bucket to retrieve the key, employees could do the same and therefore compromise the key.
 - Answer C is good but it does not really protect your logs as you cannot write them straight into S3. S3 is an object store and cannot be used reliably as a block device.

Question 5: Fat client application

 - Answer D is the best. Using the SSL VPN client the users can securely connect to the VPC and have access to the private subnets. The fat client can then connect over the VPN tunnel to the application servers which are safely in the private subnet.
 - Answer A does not make sense; AWS Direct Connect is to allow a 'private' line from your data center into AWS and therefore does not come into play for this scenario.
 - Answer B is not valid as you don't want to publish the application on the internet, therefore an ELB by itself won't help.
 - Answer C is not valid as you would still place your application servers in the public subnet. Having the IPsec VPN connection is meant to avoid this need.

Question 6: Legacy engineering application migration

 - Answer B is indeed the way to go: an initial sync followed by incremental syncs to make sure you get all the data in the latest state within the time frame. If needed you could perform multiple incremental syncs (note that these would incur additional cost as you would be consuming more bandwidth).
 - Answer A is not valid as it does not provide a solution for the time needed to transfer the 900 GB of data.
 - Answer C is not valid as AWS Import/Export is not meant to migrate data within 48 hours.
 - Answer D is not valid because it says to copy the data on Friday, which again does not provide enough time to transfer all the data.