region = geographic area with at least 2* availability zones, independent collection of AWS computing resources in a defined geography, currently 25 regions (as at Nov 2020) including 2x GovCloud and 2x China:
us-east-1 - US East North Virginia
us-east-2 - US East Ohio
us-west-1 - US Northern California
us-west-2 - US Oregon
ca-central-1 - Canada central
sa-east-1 - Sao Paulo
eu-west-1 - Ireland
eu-west-2 - London
eu-west-3 - Paris
eu-central-1 - Frankfurt
eu-north-1 - Stockholm
eu-south-1 - Milan
ap-east-1 - Hong Kong
ap-northeast-1 - Tokyo
ap-northeast-2 - Seoul
ap-northeast-3 - Osaka
ap-south-1 - Mumbai
ap-southeast-1 - Singapore
ap-southeast-2 - Sydney
af-south-1 - Cape Town
me-south-1 - Bahrain
cn-north-1 - Beijing
cn-northwest-1 - Ningxia
us-gov-east-1 - US Gov Cloud East
us-gov-west-1 - US Gov Cloud West
Breakdown:
6 Americas - 4x US + Canada + Sao Paulo
8 EMEA - 2x UK&I + 4x Mainland Europe + Bahrain + Cape Town
7 APAC
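The breakdown can be cross-checked against the region list with a small sketch. The prefix-to-group mapping and the function names are illustrative assumptions of mine, not an AWS API; the region codes are copied from the list above.

```python
from collections import Counter

# Region codes copied from the list above (Nov 2020).
REGIONS = [
    "us-east-1", "us-east-2", "us-west-1", "us-west-2", "ca-central-1",
    "sa-east-1", "eu-west-1", "eu-west-2", "eu-west-3", "eu-central-1",
    "eu-north-1", "eu-south-1", "ap-east-1", "ap-northeast-1",
    "ap-northeast-2", "ap-northeast-3", "ap-south-1", "ap-southeast-1",
    "ap-southeast-2", "af-south-1", "me-south-1", "cn-north-1",
    "cn-northwest-1", "us-gov-east-1", "us-gov-west-1",
]

# Hypothetical prefix -> group mapping. Order matters: "us-gov-" must be
# checked before the plain "us-" prefix.
GROUPS = [
    ("us-gov-", "GovCloud"),
    ("cn-", "China"),
    ("us-", "Americas"), ("ca-", "Americas"), ("sa-", "Americas"),
    ("eu-", "EMEA"), ("af-", "EMEA"), ("me-", "EMEA"),
    ("ap-", "APAC"),
]


def group_of(region: str) -> str:
    """Return the group for a region code based on its prefix."""
    for prefix, group in GROUPS:
        if region.startswith(prefix):
            return group
    raise ValueError(f"unknown region prefix: {region}")


def breakdown(regions) -> Counter:
    """Count regions per group."""
    return Counter(group_of(r) for r in regions)
```

The 6 + 8 + 7 split covers 21 regions; the 2 China and 2 GovCloud regions make up the remaining 4 of the 25.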
Availability zones (AZs) sit within a region and are designed to provide HA to that region by being isolated from each other's failures (e.g. us-west-1a, us-west-1b, us-west-1c)
to distribute load across AZs, AZ names are mapped per account: us-west-1a may be different hardware to us-west-1a in another AWS account, see doc
AZs within a region are connected via low-latency network
vs AWS Local Zones which have selected AWS resources (compute and storage) closer to customer for latency advantage. Different pricing than parent region.
vs AWS Outposts which are AWS fully managed compute/storage racks on-premise
vs AWS Wavelength which have selected AWS resources near 5G telco providers
edge = CDN endpoint for CloudFront, where content is cached (see AWS CloudFront section)
support
levels:
basic: free
developer: general guidance <24 hours, system impaired <12 hours
business: developer + production system impaired <4 hours, production system down <1 hour. Support API.
enterprise: business + business critical system down <15m. Technical Account Manager, Well-Architected/Operations Reviews
increasing level of architecture, dedicated support. After business get full trusted advisor checks, Support API (manage support cases, and trusted advisor)
ARN Amazon Resource Name
format: arn:partition:service:region:account-id:resourcetype:resource (resource type and resource could be separated by / instead, i.e. resourcetype/resource)
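A minimal sketch of splitting an ARN into the fields named in the format above. The `Arn` tuple and `parse_arn` are hypothetical helper names, and the example ARNs use made-up account IDs and resource names.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str      # empty for global services such as S3 or IAM
    account_id: str  # empty for S3 bucket ARNs
    resource: str    # keeps any further ":" or "/" separators as-is


def parse_arn(arn: str) -> Arn:
    """Split an ARN on its first five colons; the rest is the resource."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])
```

Splitting at most five times keeps resources such as `function:resize:2` or `user/alice` intact, whichever separator the service uses.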
massive amount of data, online analytics (OLAP) = Redshift
relationship between objects is the data = Neptune
temporary fast small storage volatile = Elasticache
AWS S3 - Simple Storage Service
simple storage service, object based storage (not block), secure, durable, stores files, only pay for used storage
good for: i) web-content, media as target for CloudFront, ii) static web sites, iii) data lake for analytics, iv) backups, archive via Glacier
bad for: file system (EFS), structured data and query (RDS, DynamoDB, CloudSearch), rapidly changing data if read/write latency important (EBS, EFS, RDS, DynamoDB), archive (Glacier), dynamic web site (EC2, EFS)
data spread over multiple devices and facilities (can lose 2 facilities (availability zones?) and be OK). All classes have 99.999999999% durability (11 9’s) except RRS (99.99%)
files 0 byte to 5TB (but largest put is 5GB so need multipart). Theoretically unlimited storage. Stored in buckets. Bucket names must be universally unique (e.g. https://.amazonaws.com/), lowercase
for performance best to store large files (gzip batches of small files together), sort keys (names) before uploading dataset, use concurrent thread/machine download, use exponential backoff for retry
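A minimal sketch of retrying with exponential backoff, as mentioned in the tip above. The function name, the defaults, and the added random jitter are my assumptions for illustration; they are not taken from an AWS SDK (the SDKs ship their own retry logic).

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    The wait ceiling doubles after each failure (base_delay * 2**attempt),
    capped at max_delay; the actual wait is a random value up to that
    ceiling (jitter, an assumption added here to avoid synchronized retries).
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

In real use the `except` clause would be narrowed to throttling or transient errors only; catching everything is just to keep the sketch short.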
consistency (as it stood before Dec 2020, when S3 became strongly consistent for all PUTs and DELETEs)
read-after-write consistency for PUTS (new data immediately accessible to everyone after commit)
“HEAD or GET requests of the key before an object exists will result in eventual consistency” = could return stale data (theorised cache behind S3 caches miss)
eventual consistency for overwrite PUTS and DELETES (i.e. takes time to propagate). No such thing as POST
updates to a single key are atomic
theorised that there’s a cache (so new objects are misses and therefore consistent read after write, but PUT/DELETE need cache expiry)
encryption
server side encryption - transparent to client (so client passes/downloads unencrypted, except TLS)
SSE-S3 - managed key for each object protected by rotating master key (“multi layers of security”). AES-256. Common/simple, can’t lose the key as AWS manages it. The only one that cross-region replication handles without extra setup
SSE-KMS = similar to SSE-S3 but separate perms for envelope key (a key that protects your data’s encryption key) so can be audited. (who is decrypting, failed decryption). Can use own key or AWS provide a region/role specific-key
SSE-C = client provides symmetric encryption key (AES) on upload/get and AWS does encryption/decryption then forgets key. Each object/version can have its own key (client has to track this). If you lose the key, the object is lost.
client side encryption - client encrypts it before uploading to S3. Provides full end-to-end control of the encryption and decryption of objects. Needs a library, e.g. Amazon S3 Encryption Client. Could still use KMS-CMK to store key (i.e. client gets key from KMS and does encryption/decryption client-side)
4-levels of key management: i) SSE-S3: AWS handle everything, ii) SSE-KMS: I own CMK but put in place AWS can get to (i.e. KMS), iii) SSE-C: don’t store private key in AWS but still want server-side encryption so have to pass key in API requests, iv) client-side: don’t want AWS to touch key at all
HIPAA compliance implies SSE-KMS (CMK controlled by you using envelope encryption and has audit trail, since KMS itself not HIPAA compliant, don’t send PHI to it)
comprised of: key/value store with version, metadata (tags, dates), subresource (not covered), ACLs
tiered storage classes
standard - designed for 99.99% availability (99.9% SLA, below which service credits apply)
S3 IA - infrequent accessed - 99.9% available (99% SLA, lower than RRS), requires rapid access when needed, immediately accessible, cheaper than s3 but retrieval fee, not cost-effective for objects <128KB (extra metadata required), 30 day minimum
S3 IA 1-zone - 99.5% available (99% SLA as with regular S3 IA)
S3 RRS - reduced redundancy storage - 99.99% availability (99.9% SLA) and durability over the year , can only lose one facility (availability zone?), good for stuff that can be lost.
More expensive than standard now as AWS are deprecating it
Glacier (see also Glacier section below) but S3 Glacier storage slightly different: cannot store directly into glacier (need lifecycle policy). 90 day minimum
s3 glacier retrieve via S3 API or management console (cannot use Glacier API - since vault/archive concept not in S3; not aws s3 cli either) into RRS for defined period (Glacier is 24 hours and download from Glacier itself)
pricing - storage (cheaper as more), requests, transfers
lifecycle management (migrate from IA, RRS, Glacier) - minimum of 30 days to transition to S3 IA, won’t transition objects <128KB (extra metadata makes it not cost-effective), but glacier can be done 1 day, permanently delete, manage versions too, can’t move things to RRS with lifecycle management
versioning - at the bucket level, once enabled can only be suspended (can’t be turned off), versions takes space
add versioning with MFA Delete for additional security (need MFA to delete)
only owner can permanently delete
cross-region replication (CRR) - replicate new or updated uploads and deletes (but not if delete specific version) to another region/bucket, requires access policy to read object and ACL, requires versioning on both sides (replicates ACL, tags, creation date too)
why: latency, additional data security, compliance
can change storage classes, ownership (e.g. if cross-account)
encryption: only some encrypted objects replicated (SSE-S3, SSE-KMS). Can’t do client-side encrypted objects. By default SSE-KMS (CMK in KMS) not replicated, but possible with extra steps: need decrypt permissions with source key, encrypt permissions with dest key, need dest key CMK in source region KMS)
does not replicate: existing objects in source, delete when object version is specified, objects created by replication (e.g. if you chained replication so can use this to have two buckets replicating each other for durability), objects from lifecycle (i.e. only user-initiated, but can have some lifecycle policy in other bucket which sends to glacier for archiving)
limits: only goes to one bucket, replication rules based on path prefixes (kinda dumb)
analytics
data lake for Athena, RedShift Spectrum, Quicksight
IoT streaming repository for Kinesis Firehose
machine learning, AI storage for rekognition, Lex, MXNet
storage class analytics for S3 management
tags - helps billing (assign to project/customer), attribute based access control (ABAC)
why ACL? object is owned by someone else in your bucket (i.e. you allow upload), object owner has to allow access via ACL; might be easier to have object-specific access than bucket policy since ACL is object-level
user-based: IAM
MFA before delete or change versioning state - data protection
can have a bucket policy to require server side encryption (although this probably doesn’t work for client-side encryption as AWS won’t know if it’s encrypted or not)
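A sketch of such a bucket policy, built as a Python dict so it can be dumped to JSON. It denies any `s3:PutObject` request that does not carry the `x-amz-server-side-encryption` header. The bucket name and function name are hypothetical.

```python
import json


def require_sse_policy(bucket: str) -> dict:
    """Bucket policy denying uploads that do not request server-side encryption."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyUnencryptedUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # "Null": true matches requests where the header is absent.
            "Condition": {
                "Null": {"s3:x-amz-server-side-encryption": "true"}
            },
        }],
    }


if __name__ == "__main__":
    # "my-example-bucket" is a placeholder name.
    print(json.dumps(require_sse_policy("my-example-bucket"), indent=2))
```

The condition only checks for the server-side encryption header, so it cannot tell whether an object was encrypted client-side before upload.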
encrypting existing objects may involve reupload or a copy?
pre-signed URL - even if object is private, pre-signed URL can access it (provided who created the URL has access), is time-bombed
restricting access to private content (e.g. pay a fee to get content, subscribers-only). Methods:
Origin Access Identity (OAI) - special IAM user for CloudFront where S3 bucket has granted perms and S3 has removed public access. Enforced by S3, not CloudFront. Doesn’t work if S3 bucket is a website (setup S3 website as custom origin and restrict by header instead).
signed URL - CloudFront trusts a signer and your custom app creates magic URLs that can be time-bombed. Your own URLs can’t use the query params Expires, Policy, Signature, Key-Pair-Id as CloudFront reserves them for the signature
signed cookies - CloudFront trusts a signer and your custom app sets the magic 3 cookies (has a signature) that can be time-bombed
inventory - list everything under bucket/prefix, delivering results to bucket, schedule daily/weekly and can trigger SNS/SQS/Lambda. Useful for big data (first step usually list all available files)
access log tracks per access request: requester, bucket, status. Help understand usage/bill. Best-effort logging delivery (could be late, incomplete)
events - trigger SNS/SQS/Lambda when something happens (put, delete; but not lifecycle or restore from Glacier) at prefix/suffix level (e.g. trigger Lambda to create thumbnails from .raw files)
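A sketch of the Lambda side of the thumbnail example: pulling bucket and key out of an S3 event notification. The function name and the `.raw` filter are illustrative assumptions; the `Records[].s3.bucket.name` and `Records[].s3.object.key` fields follow the S3 event notification structure, where keys arrive URL-encoded.

```python
from urllib.parse import unquote_plus


def raw_uploads(event: dict, suffix: str = ".raw") -> list:
    """Return (bucket, key) pairs for event records whose key ends with suffix."""
    found = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Keys are URL-encoded in the event (a space arrives as "+").
        key = unquote_plus(s3["object"]["key"])
        if key.endswith(suffix):
            found.append((s3["bucket"]["name"], key))
    return found


def handler(event, context):
    """Hypothetical Lambda entry point: report what would be thumbnailed."""
    uploads = raw_uploads(event)
    for bucket, key in uploads:
        print(f"would create thumbnail for s3://{bucket}/{key}")
    return len(uploads)
```

A suffix filter configured on the notification itself would avoid invoking the function for other objects; the check here is only a second line of defence.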
performance tips: typically 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix. To get more:
for large files, need to design API call for large options (multi-part upload API: initiate multipart upload request which gives you an upload ID, do 1-10,000 part uploads referencing the ID, each returning an ETag (during upload, MD5 sum of content assuming no SSE-C), complete upload with upload ID and ETags, use lifecycle policy to clean up abandoned uploads)
concurrent GET with byte ranges of 8-16MB parts with one connection per ~90 MB/s available. Add one at a time, could hit CPU limit
for REST API, should pool connections to minimise expensive SSL handshake
aggressive retry timeouts for latency sensitivity (retry will try different path) with exponential backoff (after 2 seconds, retry after 2 seconds with new DNS lookup/connection, and if that fails, backoff)
S3 transfer acceleration - use CloudFront to accelerate upload to s3, upload to edge which then transfers to s3, magic URL, costs money, further away gives better improvement
supports bittorrent - generate .torrent file and can download from it, only 5GB (largest PUT size), only regions created before May 31, 2016
previously advised to use hash prefixes, but these are no longer needed to reach this throughput (previously 500/800 PUT/GETs per second). Still useful though: create 10 prefixes, get 10x performance.
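The multipart and byte-range tips above both come down to cutting an object into parts. A minimal sketch, assuming the documented multipart limits in binary units (parts of 5 MiB to 5 GiB, at most 10,000 parts, objects up to 5 TiB); the function names and the 16 MiB default are my own choices.

```python
import math

MIB = 1024 ** 2
MIN_PART = 5 * MIB           # smallest allowed part (except the last one)
MAX_PART = 5 * 1024 * MIB    # largest single part / single PUT
MAX_PARTS = 10_000           # most parts per multipart upload
MAX_OBJECT = 5 * 1024 ** 4   # largest object


def plan_parts(object_size: int, preferred_part: int = 16 * MIB):
    """Return (part_size, part_count) that fits within the limits above."""
    if not 0 < object_size <= MAX_OBJECT:
        raise ValueError("object size out of range")
    # Grow the part size if the preferred size would need too many parts.
    part = max(preferred_part, MIN_PART, math.ceil(object_size / MAX_PARTS))
    part = min(part, MAX_PART)
    return part, math.ceil(object_size / part)


def byte_ranges(object_size: int, part: int):
    """Inclusive (start, end) offsets, usable as 'bytes=start-end' Range headers."""
    return [(start, min(start + part, object_size) - 1)
            for start in range(0, object_size, part)]
```

For example, a 100 MiB object with the default 16 MiB part size gives 7 parts, the last one being 4 MiB.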
S3 select - in-place SQL query over CSV, JSON, or Apache Parquet format with GZIP or BZIP2 (for CSV and JSON objects only). Output is CSV or JSON. All forms of SSE supported (SSE-C needs to provide key in request), but not client-side encryption.
S3 batch - run operations (copy, replace tags, retrieve out of glacier, Lambda) on list of files specified in manifest file. Since applies to each file, can’t aggregate small files into one archive and then put to Glacier.
AWS Macie: uses machine learning to discover, classify sensitive data, continually monitors. Trigger alerts. Dashboard to see how used/accessed.
Glacier
archive, secure durable storage, could be 3-5 hour wait (minutes if pay more), cheapest (1c/GB/month). Retrieval costs #reqs/month. Assumes 90 day minimum before delete, 99.999999999% durability (same as S3). Region specific
retrieve first 5% of average monthly storage for free
vault = container for storing unlimited number of archives. URL is https://glacier.us-west-2.amazonaws.com//vaults/myexamplevault
archive = data (photo/video.file) with an ID and description (optional). Immutable. Can exist in one or more vaults. URL is https://glacier.us-west-2.amazonaws.com/111122223333/vaults/myexamplevault/archives/NkbByEReallyLongEXAMPLEArchiveId
vault lock policy: 24 hour cooldown period after initiating the lock to confirm it’s working (abort or complete it). Immutable once completed
apply policies like: WORM model (Write Once Read Many), only allow deletes after 1 year
vs vault access policy which is mutable.
access = IAM provision access for users
no KMS support (AWS manages keys or you do client-side encryption). Appears S3 objects with SSE-C/SSE-KMS still transitioned but encrypted by Glacier’s default key.
job = async operation used to retrieve an archive or vault inventory list. Use SNS to notify job completions. Download job output after completion. URL is https:////vaults//jobs/
then download in chunks within 24 hours.
can be expedited (1-5 minutes) for $ (either on-demand or provisioned), standard, or bulk (lowest cost retrieval taking 5-12 hours)
range retrieval - restore specific megabytes (need tree-hash, hash per megabyte). Seems useless with compression
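A sketch of the tree hash mentioned above, as I understand Glacier's SHA-256 tree hash: hash each 1 MiB chunk, then hash adjacent pairs of digests together, level by level, until one digest remains (an unpaired digest is carried up unchanged). The function name is mine.

```python
import hashlib

MIB = 1024 * 1024


def tree_hash(data: bytes) -> str:
    """SHA-256 tree hash over 1 MiB chunks, returned as a hex string."""
    chunks = [data[i:i + MIB] for i in range(0, len(data), MIB)] or [b""]
    level = [hashlib.sha256(chunk).digest() for chunk in chunks]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                next_level.append(hashlib.sha256(pair[0] + pair[1]).digest())
            else:
                next_level.append(pair[0])  # odd one out moves up unchanged
        level = next_level
    return level[0].hex()
```

For data of 1 MiB or less the tree hash equals the plain SHA-256 of the data. Because the leaves are per-megabyte hashes, a retrieved range can only be verified this way when it lines up with the tree, which is why range retrievals are megabyte-aligned.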
glacier select - like S3 select but worse, only uncompressed CSV. Have to initiate select job and uses standard glacier retrieval times (expedited/standard/bulk) and stores results in S3
supports SSE-S3/SSE-KMS. No SSE-C (S3 select supports this) or client-side support.
deep glacier - cheapest storage but more expensive retrieval pricing, minimum 180 days (glacier is 90 days, IA is 30 days), 12 hour retrieval, no “select” capability
limits = management console can only create/delete vaults (use REST API for archive/job operations - SDK or CLI). Archive size is 1 byte to 40TB
S3 Vs Glacier Cost Shootout
Assuming:
Ohio region in Nov 2019
20 TB per month of storage
30 requests per month
30 PUTs and GETs per month
5 TB retrieved per month (with no Select)
Then:
if retrieving <13% of the data (cheapest to most expensive): Glacier expedited < S3 1z IA < S3 standard
if retrieving 13%-87% of the data: S3 1Z IA. Glacier expedited only cheaper than S3 standard at <32%
if retrieving >87% of data: S3 standard < S3 1Z IA < Glacier expedited
Notes:
See Google Sheet “AWS S3 Storage Shootout”
1-zone IA not that much cheaper than IA (storage $0.01 vs $0.0125)
at IA and beyond, start paying retrieval per GB (in addition to per number request)
Standard really is for frequent access. Cheapest storage once >86% of data is retrieved.
Glacier expedited retrievals: need to keep number of requests low (less than 13%, since $10 per 1000 requests), GB returned significant (more expensive than storage of any class)
AWS Storage Gateway
VM deployed on-prem for seamless secure connectivity between AWS S3/EBS/Glacier and on-prem/internal to either S3 as files or a snapshot (volume)
why? on-prem disaster recovery, migrate to cloud (lift-and-shift)
supports VMware ESXi or Microsoft Hyper-V but could be on EC2 (launch AMI, EFS can be mounted on-premise, perhaps iSCSI over network)
Cloud tape backup to S3, Glacier, or Glacier Deep Archive
Comparison by gateway type (file gateway / cached volume gateway / stored volume gateway / tape gateway):
Why?
  file gateway: Elastic network share, disaster recovery
  cached volume gateway: SAN, migrate to cloud (32 TiB in cloud with only 300 GiB local). Cheap storage.
  stored volume gateway: Low-latency SAN, migrate to cloud, DR
Local gateway access
  file gateway: NFS, Samba (SMB)
  cached volume gateway: iSCSI
  stored volume gateway: iSCSI
  tape gateway: iSCSI
Local gateway has
  file gateway: Cache and files yet to be written to S3
  cached volume gateway: Most recently used files (full volume in S3)
  stored volume gateway: Full volume
  tape gateway: Data waiting to be written to AWS
Transfer to S3
  file gateway: Async
  cached volume gateway: Async volume or schedule
  stored volume gateway: Async volume or schedule
  tape gateway: Controlled by backup software. No dedicated local gateway storage so async???
Direct access to files in S3
  file gateway: Yes
  cached volume gateway: No, EBS snapshot to volume gateway or EBS volume
  stored volume gateway: No, EBS snapshot to volume gateway or EBS volume
  tape gateway: No, virtual tape library (VTL) not shown in S3. Use backup application to access data out of VTL (if in Glacier, need to request retrieval)
Consistency
  file gateway: can have multiple reader/writers (i.e. other gateways) but uncoordinated = unreliable; single writer highly recommended; need polling or RefreshCache API call or CloudWatch event for gateway to know if in S3 or lifecycle deleted it
  cached volume gateway: Only via gateway or EBS snapshot
  stored volume gateway: Only via gateway or EBS snapshot
Disk allocation
  file gateway: Cache
  cached volume gateway: Cache, Upload buffer
  stored volume gateway: Upload buffer, stored volume
  tape gateway: Cache, Upload buffer
Limits
  file gateway: 1 file share per bucket; 10 file shares per gateway; 1024 char path limit; 5TB file size (same as S3)
  cached volume gateway: 32 x 32TiB volume (more than 16TiB can only restore back to gateway) = 1 PiB
  stored volume gateway: 32 x 16 TiB volume (EBS limit) = 0.5 PiB
  tape gateway: 1500 tapes per VTL; 100 GiB to 5 TiB tapes; total 1 PiB per VTL
Gotchas
  file gateway: no sym link; rename is instant on gateway but copy-put then delete in S3 (eventually consistent)
  stored volume gateway: Need enough space locally for entire stored volume; restore from AWS needs to copy down to stored volume so takes time
  tape gateway: Readable via tape application only
can throttle bandwidth
on-prem migration: use stored volume gateway initially to move data, then cached
storage gateway snapshots
snapshot includes whatever is in local storage (even if not uploaded to AWS yet)
incremental snapshots uploaded
schedule every 1, 2, 4, 8, 12, or 24 hours
can read back via storage gateway (since stored volume has primary data in the gateway, will take time to download) or EC2 as an EBS volume
file gateway:
1:1 file naming, select
storage classes set at file share level support S3 standard, standard-IA and one-zone IA. Use S3 lifecycle to transition to glacier (if file is in glacier, will get IO error on read, need to request retrieval first, could trigger this off CloudWatch events)
has local metadata cache to make “ls” fast, can be updated via API (e.g. if someone writes to S3 behind you)
only uploads changes (which makes editing files fast, don’t need large local file storage to manipulate big files)
version via S3 versioning
security: all support KMS. Supports Challenge-Handshake Authentication Protocol (CHAP) to authenticate iSCSI and initiator connections
monitoring:
cache hit ratio (at least for cached volume) to see if cache needs to be resized
cost = storage space ($ per GB per mo, comparable to S3), data written to storage gateway ($GB/mo up to 125/mo), snapshots( per GB per mo of EBS), data out from (request), additional costs for tape gateway (penalty to delete archive younger than 90 days but usually delete free)
AWS RDS relational database service
Offering: MSSQL, Oracle, MySQL, PostgreSQL (object RDBMS, i.e. inheritance), Aurora (Amazon’s flavour of MySQL specifically tuned for AWS), MariaDB (fork of MySQL after Oracle bought it)
Comparison by engine (Aurora / MySQL / MariaDB / PostgreSQL / Oracle / SQL Server):
Multi-AZ
  Aurora: Already replicates 2 copies per AZ x 3 AZs. Use Aurora replica for failover.
  MySQL: Via AWS tech
  MariaDB: Via AWS tech
  PostgreSQL: Via AWS tech
  Oracle: Via AWS tech. Still need license in both.
  SQL Server: Via SQL Server Mirroring
Can multi-AZ after creation
  Aurora: No
  MySQL: Yes
  MariaDB: Yes
  PostgreSQL: Yes
  Oracle: Yes
  SQL Server: No
Read replicas (max)
  Aurora: Yes (15 Aurora replicas across AZs within region only). Cross-region with MySQL. 2018 has Aurora Global Database
  MySQL: Yes (5). Can delay replication for disaster recovery. Can replicate to non-RDS MySQL/MariaDB using native MySQL features
  MariaDB: Yes (5). Use GTID replication to replicate non-RDS MariaDB into RDS.
  PostgreSQL: Yes (5)
  Oracle: Yes (5) but need a Data Guard license (BYOL only)
  SQL Server: No
Write to read replica (e.g. add index)
  Aurora: ???
  MySQL: Logical replication so yes
  MariaDB: Logical replication so yes
  PostgreSQL: Physical replication so no
  Oracle: Physical replication so no
  SQL Server: N/A
Read replica of a read replica
  Aurora: Yes
  MySQL: Yes
  MariaDB: Yes
  PostgreSQL: No
  Oracle: No
  SQL Server: N/A
Read replica deletion of master
  Aurora: Replicas remain active, one is promoted to master
  MySQL: All replicas are promoted to independent master DBs
  MariaDB: Replica remains active
  PostgreSQL: All replicas are promoted to independent master DBs
  Oracle: N/A
  SQL Server: N/A
Apply DDL to read replica (i.e. add index)
  Aurora: No???
  MySQL: Yes
  MariaDB: Yes
  PostgreSQL: No
  Oracle: N/A
  SQL Server: N/A
Create backup from read replica
  Aurora: No???
  MySQL: Yes
  MariaDB: Yes
  PostgreSQL: Yes
  Oracle: No
  SQL Server: N/A
Size limits (may require latest EC2 instance type and gp2/io1)
  Aurora: 64 TB table
  MySQL: 64 TB max
  MariaDB: 64 TB max
  PostgreSQL: 64 TB max
  Oracle: 64 TB
  SQL Server: 16 TB
Gotchas
  MySQL: MyISAM not crash-consistent so snapshots might be corrupted (makes read-replica and multi-AZ difficult)
  Oracle: Enterprise only supported with BYOL; read replica needs BYOL; no cross-region replica support
  SQL Server: No read-replicas
SLA: 99.95% uptime SLA
OLTP (online transactional processing) traditional query (where ID = 123),
OLAP (online analytical processing) pulls in large number of data, run against a warehouse, business questions
cost = EC2 instance per hour, DB storage (doubles for multi-AZ) in GB/mo, SSD (IOPS/mo), magnetic (I/Os/mo), backup storage (although 100% of provisioned space free, pay after this), data out to other AZ (flat), data out to internet (scales after 10TB/mo, first 1GB/mo free)
backups and snapshots
backups enabled by default allowing point in time recovery (to the second) by replaying transaction logs (shipped every 5 min to S3 so can usually backup up to 5 mins ago) allow restore to any point in time (to the second, although 2-step process) - DR
retention period default of 7 days on console or 1 day on API/CLI (1-35 days), except Aurora which is always 1 day default. Changes to this take effect immediately. Don’t use 0 as outage needed to modify to non-zero
no performance hit
snapshot - for checkpointing before large changes, moving between environments/region/account
automatic basic daily snapshot (during your maintenance window) and/or manual
will incur latency hit during snapshot, so define backup window, I/O may be paused (seconds) and higher latency. I/O not suspended for multi-AZ deployment (can do on read replica on MySQL (InnoDB only), MariaDB, PostgreSQL; not on Oracle)
stored in S3 (not deleted), free storage equal to provisioned size of DB (e.g. 10GB provision can have 5x 2GB snapshots for free)
restored DB will have new RDS instance and new DNS endpoint, can change AZ, share, copy to diff region, DB vendor, hardware
will be initially slow as EBS volumes hydrated on-demand from S3
on delete: option to create final snapshot and automated snapshots will be deleted
security
encryption at rest for all RDS offerings. Using KMS. No performance hit. Underlying storage, backups, replicas, snapshots encrypted too. Can’t remove encryption of existing DB
to add encryption: create DB snapshot, copy snapshot using KMS encryption key, restore from encrypted snapshot
of network via VPC, deploy to private subnet
of authorisation via IAM, can add MFA to protect delete; use regular DB users/roles for object security. Can allow IAM or AD users to log into DB
EBS volumes - gp2, io1, also standard (but as a read replica only and not recommended anymore). No support for st1 and sc1
might not get all provisioned IOPS: constrained by threads/locks, buffer/cache are efficient
multi availability zone - using same DNS endpoint (i.e. no connection string changing, it’s automatic failover). Usually takes 1-2 minutes. For disaster recovery only. Free data transfer (but pay for instances).
DB subnet group is a collection of subnets for RDS. Should have at least one subnet per AZ. Even if single-AZ, should have multiple to support multi-AZ in the future (RDS will pick your preferred AZ)
Performance: synchronous replication (so can incur write/commit latency). Can’t read standby (failover only). Recommend io1-optimized instances. Improves backup since standby is used
Can add multi-AZ later where AWS tech used (so not Aurora or MSSQL)
Not for Aurora, it replicates for you (2 copies per AZ x 3 AZs = 6)
Process: snapshots primary DB, restores snapshot to other AZ, then set up synchronous replication. Avoids downtime but has a performance impact. Read replicas should continue.
Exceptions: MSSQL uses SQL Server Mirroring so doesn’t use snapshot process. Aurora storage already multi-AZ, use Aurora replica for failover (better than RDS multi-AZ since you can read it)
OS/DB patches applied to standby first. When done on master, will require outage.
Can’t control which AZ or how many (there will only be 1 standby), could have cost implications if EC2 instances are in different AZ
what happens during multi-AZ failover?
block-level replication between EBS volumes. Failover triggered by automation (monitoring of quorum) or API, start standby, sync logs, DNS switch, create another standby
test “reboot with failover” and ensure apps respect DNS TTL
read replica - read-performance, async replication (so “replica lag” metric) from source to up to 5 replicas (so not as data durable as multi-AZ), each has its own DNS endpoint (so app has to decide how to route traffic).
Why? Reduce latency cost (move data closer to user, cross-region), read-only reporting instance, allow reads during master maintenance, DR of master (although async replication from master and manual operation), could test engine upgrade (MySQL/MariaDB only which allow DDL), speed up backup for MySQL, MariaDB, Oracle, and PostgreSQL
Pay for additional instances but data transfer is free
replica can replicate multi-AZ or be multi-AZ itself
on source DB delete - one replica is promoted to master to accept writes
limits
requires automatic backups enabled (created by taking a snapshot of the source), can be promoted to real/source DB, replica can be sized differently (but can create “replica lag”)
only supported by Aurora, MySQL (InnoDB only, although FAQ says MyISAM could be used but terrible because not transactional so possibly corrupted), MariaDB, Oracle, and PostgreSQL. No MSSQL as read replica relies on vendor-specific tech.
can’t be snapshotted for some vendors, use multi-AZ instead. I guess it’s not advised to backup this way as replica may be suspended (seconds) to create the backups
doesn’t do automatic promotion on master failure - because it’s dangerous (replication lag)
scaling
EC2 instance type: usually up, will cause small outage (new instance provisioned, attached to storage, reboot, change DNS).
storage type: takes performance hit as requires a copy operation, and outage if single-AZ with custom parameter group
storage size: can’t reduce, only increase. Older generation has performance hit, after initial scale then instance is “storage-optimization” then fast
storage IOPS for io1: no outage, can go up/down
can auto-scale storage (greater of 5 GiB or 12% of current storage)
older-generation MSSQL requires outage
vs DynamoDB which scales on the fly
reservations - reservation (DB engine, DB instance class, Multi-AZ Deployment option, license model and Region) so modify instance to be different, then will be billed on-demand. Can create new reservation that matches though and sell reservation on the market place. Can switch AZs. Can be used on read replicas
monitoring
enhanced monitoring provides more metrics and 1-second granularity (costs more but is free-tier), agent on machine so CPU could be different (from regular which takes it from hypervisor so suffers from busy VM server issue)
performance insights - top SQL statements, average active sessions, finds bottlenecks
maintenance - specify window for outages (DB engine upgrades) although not all changes require outage (e.g. OS, AWS patches)
Minor versions applied automatically or manual. Major versions applied manually. Version deprecations schedules 3-6 months before. Note minor/major is DB-vendor specific
if instances stopped, will be restarted after 7 days for maintenance windows (stop manually afterwards)
maintenance states: “required” (will apply, can’t defer), “available” (not automatic but could apply manually), “next window” (automated for next window), “in progress”, and “none”.
can defer “available” indefinitely (or apply immediately, or during next window). Required can defer too but will be scheduled in the future.
limits - 35 day max backup
Oracle: no RAC support, can’t use RMAN to restore into RDS (presumably no shell access to underlying EC2 instance and can’t get files onto instance); can take RMAN backups and ship to S3 as it seems you can invoke RMAN from SQL via package function???
SQLServer: can’t expand storage (restore backup into new instance), max 30 DBs per instance
100TB total storage; max size is typically 6TB, except MSSQL at 4TB, and Aurora at 64TB
40 instances; 40 clusters
can’t stop instance if it has read replica or multi-AZs, will auto-resume in 7 days
AWS Aurora
Amazon’s drop-in-replacement for MySQL/PostgreSQL, only in AWS. 5x perf of MySQL, 1/10 cost, similar availability.
Cloud native, doesn’t use EBS volumes so very low replication lag (auto promote replica to master), underlying storage is multi-AZ (still need AZ-specific EC2 instances). But…
in-region snapshots only (can’t be shared; have to use MySQL replication cross-region and then Aurora replicas of this)
Pricing models: on-demand (just like RDS) or
Aurora Serverless (pay per Aurora Capacity Unit (ACU) per hour, ~2GB memory with compute/network) for volatile workloads.
Can “pause” after default 5 mins (just pay for storage, if paused longer than 7 days might be snapshotted to S3).
Warm fleet to scale-out quickly (although if not fast enough (up: 5-50 sec, down: <15 min), then set min/max capacity), don’t pay for idle. Might not scale (if long-running query, locks, temp tables). Scaling step is doubling. Will failover to another AZ for you (could be minutes).
Aurora Serverless v2 (in preview as of Dec 2020), double the price but scales up instantly (<1 second viable), and down faster (<1 min). Scale granularity is 0.5 ACU (no more doubling), min capacity is ACU.
Aurora Global Database (one region primary, second region read-only). Async replication (<1 second typical). Allows failover <1 minute (feels not automatic, have to trigger yourself???). Can have replicas on secondary to further scale reads.
limits: MySQL only
scaling: starts at 10GB storage, increases automatically in 10GB (probably shard size?) increments to 64TB. Auto-reshaping. Not as instant as DynamoDB but push-button (minutes)
redundancy: 2 copies of data in each AZ with at least 3 AZs (6 copies).
transparently handle loss of two copies without affecting write, loss of three copies without affecting read. Self-healing disks (continuously monitor blocks and disk for errors).
auto-promotion of read replica (this is manual with others probably because of replication lag)
cluster volume shared by primary and replicas (possibly explains low replication lag)
backups - same as RDS above (up to 35 days, but possibly restore only up to last 5 minutes - frequency shipped to S3)
storage cluster has a “primary instance” (read/write) and Aurora Replicas (read)
3 flavours of replicas:
Aurora Replicas: up to 15, low replication lag (typically 100ms), in-region only. During automated failover, replica to promote chosen by priority (0: highest to 15: lowest). Can use auto-scaling to increase/decrease number of replicas.
additionally Aurora MySQL replication, up to 5 which can be cross-region. Can’t be used for automated failover (manual and possibly minutes of data loss). Could replicate RDS MySQL master to Aurora MySQL as part of migration to Aurora.
Aurora replica of RDS PostgreSQL.
write role has cluster DNS which fails over in case the primary dies. Replicas have their own endpoint URL which is LB’ed. Or you can hit instance endpoint directly too
Aurora MySQL can invoke Lambda function sync or async (native function)
migrating to Aurora:
create from RDS MySQL snapshot (can be no downtime for this)
EC2/on-premise MySQL: can use mysqldump and pipe to Aurora directly (for small DBs, incurs performance hit), or binary log replication (less downtime - ??? downtime just for initial copy)
using DMS
under the hood magic:
cloud-native storage. 10GB partitions. Managed as a log with data generated by storage (so not waste CPU on storage). Each segment has own redo log to reduce lock contention.
high throughput via quorums - acknowledge when 4/6 storage nodes acknowledge, data generated async
Aug 2019 - Aurora multi-master for MySQL - masters across multi-AZs, allows continuous writing during failover (no promotion of replica).
replication and quorum - all 6 storage nodes need to confirm (written to cache), on commit then persist. Conflict detected at 16 KB page level (could be MySQL only; so should be low contention), can avoid by: designated writer per shard, retry.
optionally pick Global Read-After-Write Consistency (GRAW) at cost of synchronous lag for strong consistency.
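The quorum behaviour noted above (6 copies, writes acknowledged at 4 of 6, loss of two copies tolerated for writes and three for reads) can be sketched as a toy model. This is not Aurora's implementation; the constants come from these notes and the function names are made up for illustration.

```python
COPIES = 6          # 2 copies in each of 3 AZs (from the notes)
WRITE_QUORUM = 4    # a write is acknowledged once 4 of 6 storage nodes confirm
READ_QUORUM = 3     # reads survive the loss of 3 copies, so 3 healthy copies suffice


def can_write(healthy_copies: int) -> bool:
    """Writes continue while a write quorum of copies is reachable."""
    return healthy_copies >= WRITE_QUORUM


def can_read(healthy_copies: int) -> bool:
    """Reads continue while a read quorum of copies is reachable."""
    return healthy_copies >= READ_QUORUM


def quorums_overlap() -> bool:
    """Any read quorum must intersect any write quorum to see the latest write."""
    return READ_QUORUM + WRITE_QUORUM > COPIES


if __name__ == "__main__":
    for lost in range(0, 5):
        healthy = COPIES - lost
        print(f"lost={lost} write_ok={can_write(healthy)} read_ok={can_read(healthy)}")
```

Losing a whole AZ removes 2 copies, so writes continue; losing an AZ plus one more copy still leaves reads available.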
AWS DynamoDB
DynamoDB = no-SQL DB service, millisecond-latency, supporting document and key-value data models, flexible (add “columns”), “push button” scaling on the fly without downtime (provision more read/write capacity units), KMS integration
always on SSD, spread across 3 geographical data centers (not AZs though, weird)
eventual consistent reads - consistency across all copies usually within a second (best read perf), default
strongly consistent reads - returns the most up-to-date response, reflecting all successful writes (i.e. the most strict), but might not be available during a network outage, scans will take twice the amount of reads
pricing based on storage and either provisioned throughput or on-demand throughput:
read unit = a 4KB read. Eventually consistent reads consume half a unit. Transactions require double.
write unit = a 1KB write. Transactions require double.
provisioned throughput mode: provision constant read/write capacity per second - better for known demand, traffic changes slowly, easier to forecast cost
can be combined with auto-scaling: uses “target tracking” policy to achieve utilisation target
separate read/write policies. Specify min/max. Can also be applied to global secondary indexes too
gotchas:
scaling lag
DynamoDB doesn’t scale down if load drops to zero (can’t tell if demand is truly zero or just not being used temporarily). Workarounds:
slowly send requests to DynamoDB until it scales down (i.e. don’t immediately drop requests to zero)
manually reduce capacity to the minimum capacity
requests beyond capacity are throttled (i.e. error). Adaptive capacity and bursting last 5 minutes of unused capacity can help reduce this
cost per RCU-hr and WCU-hour
on-demand mode: scales up/down based on reads/writes - good for unknown demand, instantly provisions capacity (no scaling lag). Cost per number of read/write requests. More expensive than provisioned but should be less wasteful and no throttling
can reserve capacity too and mix with provisioned capacity.
Eventual consistency requires half num of reads. Writes more expensive
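The read/write unit arithmetic above (4KB per read unit, 1KB per write unit, eventually consistent reads at half cost, transactions at double) can be sketched as follows. Function and parameter names are illustrative only, not part of any AWS SDK.

```python
import math


def read_units(item_kb: float, strongly_consistent: bool = True,
               transactional: bool = False) -> float:
    """Read units consumed by one read of an item of item_kb kilobytes."""
    blocks = math.ceil(item_kb / 4)      # one read unit covers up to 4 KB
    if transactional:
        return blocks * 2.0              # transactions require double
    if not strongly_consistent:
        return blocks * 0.5              # eventually consistent reads cost half
    return float(blocks)


def write_units(item_kb: float, transactional: bool = False) -> int:
    """Write units consumed by one write of an item of item_kb kilobytes."""
    blocks = math.ceil(item_kb / 1)      # one write unit covers up to 1 KB
    return blocks * 2 if transactional else blocks


if __name__ == "__main__":
    print(read_units(8, transactional=True))           # 8 KB transactional read
    print(read_units(4, strongly_consistent=False))    # 4 KB eventually consistent read
    print(write_units(2.5))                            # 2.5 KB write rounds up
```

Rounding up per item is why keeping items just under a 1KB or 4KB boundary matters for cost.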
primary key = partition key (is hashed and this determines physically where data lives, AKA “hash attribute” or “hash key”)
max size of partition is 10 GB. Limits blast-radius of failure for AWS.
(number of partitions = round_up(max(RCU/3000 + WCU/1000, total size/10GB)))
Gotcha: you’re actually paying for partitions, not RCU/WCU. If you have a hot partition, may need to increase capacity enough to use a new partition (you can’t tell DynamoDB to rebalance). Identify hot keys by logging the key when a throttle error occurs
or composite primary key = partition key + sort key (orders items within a partition AKA “range attribute”)
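The partition-count rule of thumb noted above can be sketched as below. It is a heuristic taken from these notes (DynamoDB does not expose the real partition count), and the function names are hypothetical.

```python
import math


def estimated_partitions(rcu: int, wcu: int, size_gb: float) -> int:
    """Rule-of-thumb partition count from provisioned throughput and table size."""
    by_throughput = rcu / 3000 + wcu / 1000   # each partition serves ~3000 RCU or ~1000 WCU
    by_size = size_gb / 10                    # each partition holds up to 10 GB
    return max(1, math.ceil(max(by_throughput, by_size)))


def per_partition_capacity(rcu: int, wcu: int, size_gb: float):
    """Capacity is split evenly, so a hot key only gets one partition's share."""
    n = estimated_partitions(rcu, wcu, size_gb)
    return rcu / n, wcu / n


if __name__ == "__main__":
    print(estimated_partitions(6000, 2000, 8))
    print(per_partition_capacity(6000, 2000, 8))
```

The second function shows the gotcha: with 4 partitions, a single hot key can use only a quarter of the provisioned capacity.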
secondary indexes allow faster query by different attributes. Look like tables:
local secondary index (same partition key as table with different sort key, i.e. composite key). Local since it re-uses partitions. Consumes extra write capacity to update local index
global secondary index (different partition key and optionally sort key). Global since it can query across entire table
Local Secondary Index vs Global Secondary Index:
Key
Local: must be composite
Global: partition key with optional sort key (if write to base table doesn’t have the global secondary index’s sort key, it won’t be written to index creating a “sparse index”)
Physical structure
Local: re-uses base table partition and read/write capacity units. 10 GB limit on collection (which may limit number of sort keys)
Global: new table. Own read/write units
Data
Local: local within partition
Global: global within table
Attributes projected
Local: all attributes, even if not projected from base table (although consumes read capacity and latency if not projected)
Global: only keys and projected attributes
When created
Local: at base table creation time. Cannot modify/delete after.
Global: anytime
Read consistency
Local: consistent or eventual
Global: eventual only
Naming
Local: name must be unique within base table (query by base table and index name)
Global: globally unique
Performance
Local: keeping each index in sync with base table consumes write capacity. Reading non-projected attributes costs read capacity and latency
Global: keeping in sync requires write capacity units. Base table and all global secondary indexes each need to have enough capacity provisioned.
are backed by a “base table” (either original table for local or new table for global)
projections of base table’s attributes onto index to make attribute retrieval faster (at cost of more storage). Partition and sort key are always projected.
best practices:
generally global preferred over local (unless you need consistent read)
keep writes under 1KB to fit in one write unit (i.e. don’t project everything)
small number of indexes better, smaller indexes faster to process
sparse indexes can be exploited (e.g. using a “isOpen=true” attribute on orders and remove it when order shipped, then index contains only open orders; or game leaderboard with a “champ” attribute if top-scorer, index has only champs)
shard global secondary index writes by using random “1-N” value for partition key. Makes querying on sort key very fast as data is partitioned
global secondary index can act as a makeshift eventually consistent read replica. Why? prioritising reads, isolating reads from writes
many-to-many relationships
model as adjacency list: partition key is top-level entity in relationship graph, and sort keys are other entities. Identity can contain details of the top-level entity
materialized graph pattern: partition key is person ID, sort key is type of entity plus entity ID
to query index, need to include its name in your query
sort-key best practices
for versioned data (e.g. audit), write two items (latest at sort key = “v0” and another at sort key = “v<highest number + 1>”). Can include other data in sort-key too
Other best practices
aggregation - music library which has sort keys of “details” (attributes for the song), and “download ID XYZ” for each download (timestamp is attribute). DynamoDB stream can aggregate downloads by month by updating “Downloads 2019-09” sort key. Global index would be sparse index
compress large items (store binary attribute) or store in S3 and use S3 metadata to hold primary key to link back to DynamoDB
time-series data: table per application per period. Most recent table has high read/write capacity, older has lower. Also need sharding to write efficiently
can smooth writes by writing to SQS which retries writes when capacity errors occur
TTL - automatically delete expired items by specifying an attribute that has Unix epoch time of when to delete. Takes <48-hours to delete (will still show up so may need filter out in query where expiration time < current epoch time). See deletion in DynamoDB Streams. Free
good for: session data
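Because expired items can linger before TTL deletion, readers usually filter them out themselves. A minimal sketch of computing the epoch attribute and filtering, assuming a hypothetical attribute name `expires_at` and plain dicts standing in for DynamoDB items:

```python
import time
from typing import Dict, List, Optional


def expiry_epoch(ttl_seconds: int, now: Optional[float] = None) -> int:
    """Unix epoch (whole seconds) at which an item written now should expire."""
    now = time.time() if now is None else now
    return int(now) + ttl_seconds


def live_items(items: List[Dict], now: Optional[float] = None,
               ttl_attribute: str = "expires_at") -> List[Dict]:
    """Drop items whose TTL has passed but which have not been deleted yet."""
    now = time.time() if now is None else now
    return [item for item in items
            if ttl_attribute not in item or item[ttl_attribute] > now]


if __name__ == "__main__":
    session = {"id": "abc", "expires_at": expiry_epoch(3600)}  # one-hour session
    print(live_items([session]))
```

Items without the TTL attribute never expire, which the filter preserves.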
DynamoDB streams - capture and processes changes to DynamoDB
why? capture usage metrics, trigger Lambda to send welcome email to new customer, syncing multi-master game servers across regions
stream is an ordered ephemeral (24 hours) queue. Can integrate with Lambda to create trigger functionality, or Kinesis
Transactions - atomic read/write across multiple tables/indexes (up to 25 distinct items, <4MB, in a read or write). Global secondary indexes and streams updated after transaction. Consumes two read/write units per item (one to prepare and another to commit, per block, i.e. an 8KB read is 4 read units)
best practice: keep transactions small (to increase chance of success/throughput), use BatchWriteItem for bulk insert instead.
vs BatchWrite which is not atomic
Global Table - multi-region, multi-master using DynamoDB streams behind the scenes. Strongly consistent reads can only be done in-region. Best-effort to determine who last writer is in conflicting writes
pay per rWCU (replicated write capacity unit) per hour
DynamoDB Accelerator (DAX) - in-memory cache to speed eventually consistent reads (from milli- to micro-seconds), write-through cache
good for: latency-critical (real-time share trading/bidding), caching small number of hot items, could reduce level of read capacity units required
bad for: where millisecond response is OK, consistent reads required, are write-intensive, have own caching already
high availability - recommend 3-nodes each in different AZ (1 master, 2 replicas)
pay by node-hour
3rd party JDBC drivers exist
limits = max size of an item is 400KB (max size of string and binary data too). Unlimited attributes within this size.
General No-SQL Partition Key Design
defines physical partition where data stored, need high range of values to evenly distribute (and hence high throughput), keys should fit within one partition
frequently accessed partition keys are “hot” (bad)
mitigate hot keys by adding a random suffix (although can’t query by key anymore), or a calculated suffix (hash modulo), but querying everything needs enumerating all suffixes and merging results
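The suffix-sharding idea for hot partition keys can be sketched as below. The shard count, the `#` separator and the function names are assumptions for illustration; real code would write the resulting string as the partition key.

```python
import hashlib
import random

SHARDS = 10  # hypothetical number of suffixes per logical key


def random_suffix_key(base_key: str, shards: int = SHARDS) -> str:
    """Spread writes evenly; reads must fan out because the suffix is unknown."""
    return f"{base_key}#{random.randint(1, shards)}"


def calculated_suffix_key(base_key: str, item_id: str, shards: int = SHARDS) -> str:
    """Derive the suffix from the item ID so a single item can be read back directly."""
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    suffix = int(digest, 16) % shards + 1
    return f"{base_key}#{suffix}"


def all_shard_keys(base_key: str, shards: int = SHARDS) -> list:
    """Keys to query (and merge) when everything under base_key is needed."""
    return [f"{base_key}#{n}" for n in range(1, shards + 1)]


if __name__ == "__main__":
    print(calculated_suffix_key("2019-09-01", "order-42"))
    print(all_shard_keys("2019-09-01", shards=3))
```

The calculated suffix keeps point reads cheap, while a full read of the logical key still costs one query per shard.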
AWS ElastiCache
Sub-millisecond, in-memory cache of DB services in cloud, makes reads and writes faster. Good for static DBs (faster than DynamoDB). Two different caching engines, Memcached and Redis:
Scaling
Memcached: yes, but application has to decide which node to retrieve/store data. Scaling while running not advised (will lose some cache even with consistent hashing; ASG not supported anyway so manual via console or API)
Redis: uses read replicas (async replication) to scale reads. Online vertical scaling supported 2019. For writes, cluster mode enabled with Redis cluster client can shard (multiple primary nodes), can online scale-in/out without downtime (not ASG, invoked manually)
Multi-AZ
Memcached: yes, but tradeoff could add latency for some retrievals
Redis: yes, for read replicas (async replication) to achieve high availability/durability
Replication/Durability
Memcached: no. If node dies, will lose data, but remaining shards probably OK.
Redis: master-slave for HA. Up to 5 read replicas (async replication) which can be in same/other AZ. Failover detection and promotion of read replica to master (flip DNS). Snapshot backup/restore to disk. Useful to pre-warm cache, or scale-up
Feature
Memcached: Memcached-compliant. Multi-threaded (so scaling CPU up helps). Big slab of memory in the cloud. Really just a pool of expendable nodes that grows/shrinks (like auto-scaling group, i.e. replacement of failures, discovery, although Memcached doesn’t support ASG)
Redis: relational database of stateful entities with failover (like RDS)
Memcached - multi-AZ (but latency hit when going across AZs, no replication), simple data types, can cache objects (e.g. database, web sessions, dashboards for streaming data).
Eh? Memcached has no replication/cluster by design but AWS is distributed. Need to figure out which node a value resides on using modulo hashing (but when nodes are added/removed most keys remap, roughly (N - 1)/N when growing to N nodes) or “consistent hashing” (nodes in a ring, hash to get position in ring, then walk to next boundary to find node). AWS ElastiCache Cluster Client SDK and 3rd party libraries help with this, including handling auto-discovery of new nodes during scale out
Typical architecture: EC2 instances in ASG across AZs with multi-AZ cache cluster. App has to decide which node to cache in. Multi-AZ adds some latency
Monitor cache hits via CloudWatch to right-size
Security: no-real authentication/encryption so private subnet. Protect with security group (inbound from application tier only)
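The consistent hashing idea noted above for Memcached (nodes on a ring, hash the key, walk to the next boundary) can be sketched as below and compared with modulo hashing. This is a generic illustration, not the ElastiCache Cluster Client's algorithm; the class, the virtual-node count and the node names are assumptions.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, virtual_nodes=100):
        self.virtual_nodes = virtual_nodes   # several points per node smooths the spread
        self._owners = {}                    # ring position -> node name
        self._positions = []                 # sorted ring positions
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.virtual_nodes):
            self._owners[_position(f"{node}#{i}")] = node
        self._positions = sorted(self._owners)

    def remove_node(self, node):
        self._owners = {p: n for p, n in self._owners.items() if n != node}
        self._positions = sorted(self._owners)

    def get_node(self, key):
        if not self._positions:
            raise ValueError("ring has no nodes")
        # Walk clockwise to the first point after the key, wrapping at the end.
        index = bisect.bisect_right(self._positions, _position(key))
        if index == len(self._positions):
            index = 0
        return self._owners[self._positions[index]]


def modulo_node(key, nodes):
    """Plain modulo hashing, for comparison."""
    return nodes[_position(key) % len(nodes)]


if __name__ == "__main__":
    keys = [f"session:{i}" for i in range(1000)]
    nodes = ["cache-1", "cache-2", "cache-3"]
    ring = HashRing(nodes)
    before_ring = {k: ring.get_node(k) for k in keys}
    before_mod = {k: modulo_node(k, nodes) for k in keys}
    ring.add_node("cache-4")
    moved_ring = sum(before_ring[k] != ring.get_node(k) for k in keys)
    moved_mod = sum(before_mod[k] != modulo_node(k, nodes + ["cache-4"]) for k in keys)
    print(f"consistent hashing moved {moved_ring} keys, modulo moved {moved_mod}")
```

With the ring, only keys that now belong to the new node move; with modulo hashing most keys change node, which empties much of the cache on every scale event.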
Redis - key-value store with sorted-sets/lists, Lua scripts
Typical architecture: EC2 instances in ASG across AZs pointing to single Redis master (in one AZ) with read replica in other AZ. Application has to coordinate sending writes to masters and reads to replicas (i.e. two connection strings)
read replica is async replication so careful consideration required if appropriate (e.g. display only OK, not good for: edit/update text, counter). Since async, failover could lose some data.
horizontal scale-out: simple keys/counters can be scaled out, but complex (sets, lists, hashes) very difficult. Application has to implement sharding (Redis Cluster spec tries to do this for simple). Could be lots of management of different clusters, discovery, read replicas
topologies:
cluster mode disabled (1 primary with 0-5 replicas): DNS-based 1.5 min failover. Failure affects all writes. Scale <1TB. Larger nodes.
vs enabled (multiple primaries using sharding each with 0-5 replicas). Non-DNS 15-30 second failover. Failure affects only writes of objects for shard. Scale to TB. Smaller nodes but overall more expensive, but more scalable.
failover - auto promotes replica. If lose majority (no quorum) can still recover
resharding: old-way (backup to S3 and restore but lose writes and downtime). Online resharding invoked via API with no downtime, some perf impact. Can trigger from alarm (SNS, Lambda)
Redis cluster-aware client is topology aware (so knows how many nodes, where to go to get/put object)
Complex datasets (lots of special commands):
leaderboard: uses special commands, ZADD (to add), ZREVRANGE (to get leaders), ZREVRANK (get rank of a player)
recommendation algorithm: maintain counter of likes/dislikes (since Redis can persist to disk, data could live here) and also who liked/disliked. Then feed into 3rd party algorithm
pub-sub: simplistic, not persisted so node failure loses data (use SNS instead)
queues: very-fast, and once-only-delivery (item is “in progress” until worker says it’s done). Need to be careful with node termination (Redis like a DB at this point, not a disposable cache/node)
monitoring: CPUUtilization (CPU percent * num cores; good for 1-2 core nodes) and EngineCPUUtilization (percentage of Redis Core Engine; good for 4+ vCPU). If single core maxed, need to scale-up or use read replicas
vs Memcached: Redis has more advanced data types, is not multi-threaded, snapshots, replication for scaling, pub sub, has encryption, HIPAA compliance, is multi-AZ
Usage examples
Using for web sessions? Memcached says don’t use for web sessions because users will be affected during upgrade/failure. But online seems Redis is popular (anecdotally faster than Memcached, in AWS has HA options)
Aggregating as a cache over multiple data store, computations
Stream analytics/IoT - stream of tweets, find recent tweets, aggregate/cleanse, can sort, can store hash of attributes
Big data - cache processing/cleansing results to feed to analytics step
Geospatial - recommend places near location, cache these in Redis
Fast bidder - ad tech bidding for ad spots, bidder evaluates (Redis sets, intersections, scripting) to respond extremely quickly and secure best price
Chat messages/rooms - pub-sub in Redis
Fast counter, e.g. API that limits invocations. Low-latency good for API
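The fast-counter use case (an API that limits invocations) is commonly built as a fixed-window counter. The sketch below uses an in-memory dict as a stand-in for Redis so it runs without a server; against Redis the increment would map to `INCR` and the cleanup to `EXPIRE`. The class name and limits are hypothetical.

```python
import time
from typing import Dict, Optional, Tuple


class FixedWindowLimiter:
    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # (client_id, window number) -> request count; stands in for Redis keys
        self._counters: Dict[Tuple[str, int], int] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)
        key = (client_id, window)
        count = self._counters.get(key, 0) + 1        # Redis: INCR key
        self._counters[key] = count
        # Drop counters from older windows (Redis: EXPIRE key window_seconds).
        self._counters = {k: v for k, v in self._counters.items() if k[1] >= window}
        return count <= self.limit


if __name__ == "__main__":
    limiter = FixedWindowLimiter(limit=3, window_seconds=60)
    print([limiter.allow("client-a", now=0) for _ in range(4)])
```

A fixed window allows short bursts at window boundaries; it is shown here only because it is the simplest counter-based scheme.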
Cost advantages: cheaper than scaling-up/out (easier to scale horizontally than RDS), saves on data transfer (free transfer within AZ and to EC2 in region - although EC2-side might charge for cross-AZ), save on DynamoDB provisioned capacity
Whitepaper/re:Invent:
design: understand cost of latency and stale data, TTLs match frequency, isolate caches by purpose (metadata slow moving, database updates frequently)
lazy cache: populates on warm up, eviction easy, good for heavy-read/low-write
vs write-through (update cache on write): write-through helps avoid misses, shifts latency from read to write (which matches user expectations), cache expiration easy (always up to date), but could be inefficient (caching things not read), churn, probably still need lazy caching when node fails to fill cache. Can combine with lazy caching.
expiration strategies:
always apply TTL except for write-through (says AWS and whitepaper), expiration helps with cache coherency
for metadata, whitepaper recommends having a scheduled task hit the metadata so it’s always in cache.
rapidly changing data (leaderboard, comments) can have cache of few seconds - band-aid until better solution
russian-doll caching: nested records managed within their own cache and top-level is collection of cache keys (e.g. news page with child stories, users, comments)
policies: noeviction (error at limit), allkeys-lru (LRU), volatile-lru (LRU keys with TTL), allkeys-random (random), volatile-random (random keys with TTL), volatile-ttl (shortest TTL). Non-expiring objects (e.g. metadata) may have no TTL
pre-warming: thundering herd problem. Run a script that hits specific URLs/resources. Do this when node-provisioned, but before added to group for consistent hashing. Could be triggered on SNS when nodes added/removed to ensure warmness
monitor: evictions + cache misses (high together is bad, will be high for lazy cache), bytes used for cache (since OS memory can’t tell how much is used for actual cache), swap (should not swap), concurrent connections (stable, should alarm here)
at steady state: hits > evictions (ideally 9:1 ratio, or no evictions just expirations), bytes for cache is near max
hot nodes: detect via CPU metrics, maybe log node of puts/gets (but adds lag and probably overkill unless you suspect hot nodes). Mitigate with: small cache in front of hot nodes, mapping table to redirect values to other nodes
can auto-scale out until evictions are zero
negative caching - cache the “no result” (e.g. who is top seller for this month? no sales yet so no one), this is still data
vs: CloudFront (caching close to users as possible, not application-level), RDS read replica (same structure/format so still slow, good for geographic dispersion), on-EC2 instance (introduces distributed cache coherency scaling issues, new instances start with cold caches)
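The lazy cache and write-through patterns from the whitepaper notes above can be sketched side by side. A dict stands in for the database and a tiny TTL cache stands in for a cache node; all names are hypothetical, and a cached `None` is treated as a miss, so negative caching would need a sentinel value.

```python
import time
from typing import Any, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory stand-in for a cache node with optional per-key TTL."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[Any, Optional[float]]] = {}

    def get(self, key: str, now: Optional[float] = None) -> Any:
        now = time.time() if now is None else now
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and expires_at <= now:
            del self._data[key]          # expired: behave like a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl: Optional[float] = None,
            now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._data[key] = (value, None if ttl is None else now + ttl)


def lazy_read(cache: TTLCache, db: Dict[str, Any], key: str, ttl: float,
              now: Optional[float] = None) -> Any:
    """Lazy caching: populate on a miss, rely on TTL to bound staleness."""
    value = cache.get(key, now)
    if value is None:
        value = db[key]
        cache.set(key, value, ttl, now)
    return value


def write_through(cache: TTLCache, db: Dict[str, Any], key: str, value: Any,
                  now: Optional[float] = None) -> None:
    """Write-through: update DB and cache together, so no TTL is needed."""
    db[key] = value
    cache.set(key, value, None, now)


if __name__ == "__main__":
    db = {"user:1": "alice"}
    cache = TTLCache()
    print(lazy_read(cache, db, "user:1", ttl=60))
    write_through(cache, db, "user:1", "carol")
    print(cache.get("user:1"))
```

Combining the two is straightforward: writes go through `write_through`, and `lazy_read` refills whatever a replaced node has lost.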
Other Storage
Redshift = BI, data warehouse service (OLAP), petabyte scale, columnar data storage, SSL, encrypted at rest
based on PostgreSQL, cluster of nodes (leader, compute nodes), no EC2 access but based on DS (dense storage) and DC (dense compute) instances (note these are special Redshift instances)
pricing based on num nodes and type (e.g. $0.250/hour for dc1.large 0.16TB storage). For more storage add more nodes
MPP (massive parallel processing): leader node (manages client connections, receives queries, free), and compute nodes (store data, perform queries, computations), up to 128 compute nodes
free data transfer within VPC
only 1 AZ, use snapshots to restore to different AZ during outage
1MB block size. Column-based storage (one column per CPU) partitioned by some key (to distribute evenly across nodes)
Redshift Spectrum - allows you to run Redshift SQL against S3 and will deploy across AZs
Athena ANSI-SQL engine on S3 with Presto with SerDe (CSV/JSON aware). Needs data in JSON, CSV or Parquet (best performance)
Why? first-step in analysis (just examine it raw in S3 to see if it’s useful, later load into another engine), archiving old data from DB to S3 but keeping it “online” query-able.
Needs to understand structure, so catalog in Glue or Athena-internal catalog, views
Can save output to S3
Integrations with
Glue (define data catalog so could store intermediate results to save on requery), output of ETL/crawler runs. Can be over S3 or ODBC/JDBC source too.
Quicksight for visualisation - or connect to Athena with ODBC/JDBC.
logs for ALB, Classic ELB, CloudWatch, CloudFront, VPC flow
vs Redshift Spectrum: Spectrum is for joining or unioning S3 data with existing Redshift tables, when you already know the data structure and its usefulness
Athena more S3-only, first-step analysis (don’t know data structure), higher latency
vs S3 Select - S3 Select in-place SQL-like query
Athena more complex: needs data catalog but full ANSI-SQL support (via Presto), more integration options because of Glue, visualise with Quicksight or 3rd party (ODBC/JDBC)
QLDB Quantum Ledger Database blockchain ledger (immutable/transparent) without setting up framework.
Why? Immutable (RDS not inherently immutable)
Serverless so pay based on requests and GB/mo (think of as a database)
Centralised (not decentralised consensus) for higher performance/scalability
vs blockchain: blockchain is about decentralised control (no single entity in control), for multiple parties that don’t trust each other
Timestream Database optimised for 10^12 events per day, time-ordered, append-only data, queries are always over time interval. Has analytics (interpolation)
For: IoT, telemetry, sensors
Neptune graph database (relationships) supporting Gremlin and SPARQL
used for: recommendation engines, using “CONNECT BY” syntax
probably not in the exam
DocumentDB MongoDB API compatible (can migrate from on-prem MongoDB)
similar to RDS (need EC2 instances, VPC), pay per hour and data stored/transferred
vs DynamoDB: it’s MongoDB, need EC2 instances (vs DynamoDB pay per read/write unit and storage)
AWS Backup - centralise and automate backups with lifecycles across AWS for compliance, security.
vs AWS Data Lifecycle Manager (DLM) - automate EBS volume snapshots on schedule. Redirects to EC2 documentation. Seems distinct from AWS Backup (doesn’t mention it).
AWS Transfer for SFTP - managed SFTP server. Storage in S3. Pay per hour and GB transferred
Pro-tips:
archiving/backup easy entry-point for business to use AWS
make S3-endpoints within VPC (so not using internet)
AWS provides a disk, up to clients/operating system to decide what to do with it.
EBS - Elastic Block Store
network attached storage, attached to EC2, has OS, apps, can’t attach 1 EBS to multiple EC2 instances
bad for: temp storage (EC2 local instance store), multi-node (EFS), durable storage (S3, EFS), static data (S3, EFS)
reliability: single AZ, so don’t provide reliability themselves (need S3 backup). 99.999% available (annual failure rate is 1-2 volumes for 1000 volumes per year; although EC2 SLA says 99.95%)
EBS volume types compared: gp2 (general purpose SSD), io1 (provisioned IOPS SSD), st1 (throughput optimized HDD), sc1 (cold storage HDD)
What
gp2: general purpose SSD balanced for price and performance
io1: high performance (low latency, high throughput) SSD
st1: low-cost HDD for throughput intensive workloads
sc1: lowest cost HDD for less frequently accessed workloads
Uses
gp2: most workloads, desktops, non-PROD environments
io1: critical apps requiring sustained IOPS or >16,000 IOPS or 250 MiB/s, e.g. RDBMS, No-SQL
st1: streaming at high throughput and low cost, big data, data warehouse, log processing
sc1: high-throughput at lowest cost. Infrequently accessed.
Size
gp2: 1 GiB - 16 TiB
io1: 4 GiB - 16 TiB
st1: 500 GiB - 16 TiB
sc1: 500 GiB - 16 TiB
Dominant performance characteristic
gp2: IOPS
io1: IOPS
st1: MiB/s
sc1: MiB/s
Base IOPS (initial)
gp2: 100 IOPS at <33.33 GiB
io1: 100 IOPS
Base IOPS (scaling)
gp2: 3 IOPS / GiB
io1: max ratio 50 IOPS to GiB (e.g. 100 GiB volume, max 5000 IOPS). Recommend >2 IOPS:GiB
Base IOPS (max)
gp2: 16000 IOPS @ 5334 GiB
io1: 64000 IOPS on Nitro (32000 IOPS non-Nitro). Min volume for max IOPS: 1280 GiB (Nitro), 640 GiB (non-Nitro)
st1: 500 IOPS (1 MiB I/O)
sc1: 250 IOPS (1 MiB I/O)
I/O size (can’t actually modify this??? EBS is network attached)
gp2: 16 KiB to 256 KiB
io1: 16 KiB to 256 KiB. Max I/O: 256 KiB I/O for 32000 IOPS (non-Nitro???), 16 KiB I/O for Nitro
st1: 1 MiB common
sc1: 1 MiB common
Base MiB/s (initial)
gp2: 128 MiB/s at <170 GiB
io1: no initial??? To achieve 128 MiB/s: 8000 IOPS @ 16 KiB I/O, 500 IOPS @ 256 KiB I/O
st1: 20 MiB/s @ 0.5 TiB. Scaling exactly linear at 40 MiB/s per TiB (need 12.5 TiB for max)
sc1: 6 MiB/s @ 0.5 TiB. Scaling exactly linear at 12 MiB/s per TiB (can’t achieve max, must burst)
Max MiB/s
gp2: 128 MiB/s @ <170 GiB; 250 MiB/s @ 170-334 GiB if credits available (@ 3000 IOPS, need >85.3 KiB I/O); 250 MiB/s @ >334 GiB. Initially 5.4M I/O credits (3000 IOPS for 30 minutes). Accumulates 3 IOPS/GiB up to 5.4M I/O credits.
io1: N/A, IOPS are provisioned and MiB/s tied to IOPS
st1: max credits = volume size. “Burst to” scales 250 MiB/s per TiB so min 125 MiB/s (0.5 TiB), max 500 MiB/s (hits 500 MiB/s limit at 2 TiB)
sc1: max credits = volume size. “Burst to” scales 80 MiB/s per TiB so min 40 MiB/s (0.5 TiB), max 250 MiB/s (hits 250 MiB/s limit at 3.125 TiB)
Burst boundaries
gp2: baseline IOPS: >1000 GiB hits 3000 IOPS limit. Baseline MiB/s: see Max MiB/s
st1: irrelevant >12.5 TiB
sc1: burst always relevant since max size 16 TiB is 192 MiB/s
Gotchas/Limits
io1: very expensive vs gp2 (for equivalent baseline IOPS, io1 is 2x price). Therefore io1 becomes useful when: need >16,000 IOPS or >250 MiB/s throughput; need provisioned performance >99.9% of the time (vs gp2’s 99%)
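The gp2 baseline/burst figures and the st1/sc1 throughput scaling from the comparison above can be worked through in a few lines. The numbers are the ones in these notes (they may be out of date), the burst-duration formula is the usual credit-bucket approximation, and the function names are made up.

```python
GP2_BURST_IOPS = 3000
GP2_CREDIT_BUCKET = 5_400_000   # I/O credits when the bucket is full


def gp2_baseline_iops(size_gib: float) -> float:
    """3 IOPS per GiB, floor of 100 IOPS, ceiling of 16000 IOPS."""
    return min(max(100.0, 3.0 * size_gib), 16000.0)


def gp2_burst_seconds(size_gib: float) -> float:
    """How long a full bucket sustains 3000 IOPS while refilling at baseline."""
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return 0.0                      # baseline already at or above burst rate
    return GP2_CREDIT_BUCKET / (GP2_BURST_IOPS - baseline)


def hdd_throughput(size_tib: float, base_per_tib: float,
                   burst_per_tib: float, cap: float):
    """(baseline, burst) MiB/s for an HDD volume, both limited by the type's cap."""
    return min(base_per_tib * size_tib, cap), min(burst_per_tib * size_tib, cap)


def st1_throughput(size_tib: float):
    return hdd_throughput(size_tib, base_per_tib=40, burst_per_tib=250, cap=500)


def sc1_throughput(size_tib: float):
    return hdd_throughput(size_tib, base_per_tib=12, burst_per_tib=80, cap=250)


if __name__ == "__main__":
    for size in (20, 100, 1000, 6000):
        print(size, gp2_baseline_iops(size), gp2_burst_seconds(size))
    print(st1_throughput(0.5), st1_throughput(12.5))
    print(sc1_throughput(0.5), sc1_throughput(16))
```

The sc1 result at 16 TiB shows why burst is always relevant there: baseline never reaches the cap.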
general purpose SSD (GP2) - burst up to 3000 IOPS.
aim to deliver 90% performance 99% of the time
provisioned IOPS SSD (IO1) - IO intensive, e.g DB, no-SQL, where >16,000 IOPS required, no burst, consistent IOPS instead
aim to deliver 90% performance 99.9% of the time
cost is storage (GB-mo) and IOPs per month. Very expensive as IOPS cost ~half of storage cost.
throughput optimized HDD (st1) - lower cost HDD for large sequential throughput intensive (streaming, big data, warehouses, log processing). Burst up to 125-500 MiB/s (depending on volume)
aim to deliver 90% performance 99% of the time
cold HDD (sc1) - lowest cost HDD for less frequently accessed throughput intensive (lower throughput than st1).
aim to deliver 90% performance 99% of the time
magnetic (standard) - lowest cost (although previous generation, not even in comparison table anymore), use gp2 instead. Used for cheap infrequent access (although charges per 1M requests which others don’t so in some cases might be more expensive).
volume size: 1 GiB to 1TiB
performance: 40-200 IOPS. Throughput: 40-90 MiB/s
limitations
80,000 IOPS/instance across the board (EBS optimized)
1750 MiB/s/instance across the board (EBS optimized)
IOPS vs throughput: throughput (MiB/s) = IOPS * I/O size (MiB). E.g. for gp2, 16000 IOPS at 0.015625 MiB (16 KiB) I/O = 250 MiB/s
I/O size critical for throughput and also dictates maximum size. Possibly related to OS data block size, i.e. 16 TiB limit is 4 KiB data block (2^32)
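The IOPS/throughput relationship and the block-size guess about the 16 TiB limit can be checked with a little arithmetic. A minimal sketch; the function names are illustrative, and the 32-bit block address is the assumption made in the note above.

```python
KIB_PER_MIB = 1024


def throughput_mib_s(iops: float, io_size_kib: float) -> float:
    """Throughput in MiB/s delivered by a given IOPS rate at a given I/O size."""
    return iops * io_size_kib / KIB_PER_MIB


def iops_for_throughput(throughput: float, io_size_kib: float) -> float:
    """IOPS needed to reach a throughput (MiB/s) at a given I/O size."""
    return throughput * KIB_PER_MIB / io_size_kib


def max_volume_tib(block_size_kib: int = 4, address_bits: int = 32) -> float:
    """Largest volume addressable with address_bits block addresses, in TiB."""
    return (2 ** address_bits) * block_size_kib / (1024 ** 3)   # KiB -> TiB


if __name__ == "__main__":
    print(throughput_mib_s(16000, 16))      # gp2 max IOPS at the smallest I/O size
    print(iops_for_throughput(160, 256))    # IOPS at which a 160 MiB/s cap is hit
    print(max_volume_tib())
```

Larger I/O sizes reach a throughput cap at far fewer IOPS, which is why the observed IOPS can sit well below the volume's IOPS limit.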
modifying existing volume, can increase size (need OS tweaks to see new capacity; cannot decrease size, use rsync), change IOPS, or type (except magnetic standard)
may need snapshot (for backup), move AZs, or volumes attached pre Nov 2016, also wait 6 hours before making more changes to the volume
process should be: don’t need to stop or detach volume but perhaps a backup is good idea. OS needs to expand volume.
resize volumes (Elastic Volumes allows hot root modification but older gen requires EC2 instance to be stopped)
encryption - boot/root volumes can be encrypted on newer EC2 instance types (2015). Uses KMS (so could have customer-provided key). Data encrypted in-flight (to/from EC2) and at rest.
previously couldn’t be boot volume (no way to pass key) and had to use 3rd party tools to encrypt at either block (TrueCrypt, LUKS) or file-system level or AWS partner solutions to encrypt boot volumes
snapshot - snapshot (point of time copy) of a volume, exists on S3, records incremental changes (so first snapshot may take a while to create)
why? backups (if AZ dies volumes in it are unavailable but snapshot is available in region), change storage types (e.g. DEV on magnetic, TEST on ssd), change availability zone or region, sharing
can take it while in-use/running/attached, but could be inconsistent (need app level pause on writes, or unmount the disk), can snapshot root (can’t snapshot if hibernation enabled)
after initial snapshot, incrementals are fast so pausing app could be scheduled within maintenance window
encrypted volumes auto-create encrypted snapshots, restore encrypts too if storage encrypted (decrypting probably requires rsyncing to new volume)
can share snapshots (with another AWS account) if unencrypted. Can privately share (with specific accounts) an encrypted snapshot, but need to share the customer managed CMK too
for RAID (or multi-volume snapshots) API is at EC2 instance level. Does create individual snapshots but they have to be restored together so tag them (to know which array they’re part of, order too). Previously had to stop all I/O, use an agent on EC2 to do this, or create a replica (dumb copy) on one large volume
from command-line, ec2-create-snapshot, or schedule with Ops Automator
IOPS - input/output operations per second, operation measured in KiB (max 16-256 KiB for SSD, max 1024 KiB for HDD as SSD handle small I/O better than HDD), AWS groups operations if contiguous
can be higher/lower IOPS than the limit, e.g. higher if AWS groups operations together, lower if hit EC2 limit or class limit for MiB/s (e.g. GP2 is 640 IOPS = 160 MiB/s / 256KiB)
HDD good at large, contiguous operations, bad at small random operations
there are instance limits (need EBS-optimized EC2 instances - max 48K IOPS/instance, max 800 MiB/s) and volume limits
queue length important (need enough read requests to maintain throughput), but transaction heavy apps are sensitive to IO latency, good for SSD with low queues, throughput intensive apps less sensitive to IO latency, good for HDD with high queues
reliability - 0.1-0.2% annual fail rate (replicated within AZs, related to commodity hardware’s 4% AFR). Snapshots to S3 essential.
“pre-warming” (initialise drive with “dd”) required on newly launched older gen drives (new generation EBS volumes have max performance on launch), or if volume launched from snapshot or S3 template.
instance store (ephemeral) - at provisioning time can add instance store to specific EC2 instances (some more than 1 instance store), provision from template in S3 (slow to provision as hydrated from S3 on read),
why? disk physically attached to host (fast, once provisioned), good for frequently modified data (buffer, cache), kinda free (baked into EC2’s hourly cost)
bad for: persistence (EBS, EFS, S3), relational DB storage (EBS, RDS), shared storage (EFS, S3, EBS)
cost: included as part of EC2 instance, just pay data transfer
gotchas: only some EC2 instances support it (which defines number and capacity too), can’t add after launch, can’t be stopped (only reboot/terminate), less durable to hardware/infrastructure failure, don’t show up in volumes menu, have to format file system, EC2 instance can’t tell which drive is what (need a convention), default hardware encryption (it’s direct attached to EC2 instance, could use 3rd party tools after boot to integrate with say KMS)
durability either through: replication or periodic copying to durable storage
performance
i2 (SSD-backed) for no-SQL databases (Cassandra, MongoDB), scale out transactional DB, data warehouse, Hadoop. 365K read IOPS, 315K write IOPS! Max 6.4 TiB
d2 (HDD dense storage) for massively parallel processing, data warehouse, MapReduce, log processing. 3.5 GiB/s read, 3.1 GiB/s write (at 2MiB I/O)! Max 48 TiB
backup database on top of EBS by: i) database in hot backup mode, ii) create read replica, or iii) for RAID move to single slower EBS volume
AWS EFS - Elastic File System
elastic NFS storage that shrinks/expands automatically (don’t need to pre-provision EBS volume), can mount to multiple EC2 instances (unlike EBS), on-prem too, file based (not object like S3), like a NAS
good for: big data analytics, media processing, content management, web server
bad for: archive (Glacier), RDS storage (RDS, EC2/EBS since RDS typically locks per node), temp storage (EC2 local instance store)
performance = higher latency than EBS but higher throughput (multiple GB/sec vs 1 GB/sec; note EC2 instance throughput limits), bursty (50 MB/s per TB bursting to 100 MB/s; if >1TB, bursts 100 MB/s per TB), more durable than EBS (replicated across AZ)
could “dd” to create dummy data to increase consumed space so more throughput
performance modes: general purpose vs max I/O (designed for tens/hundreds/thousands of connected EC2 instances and trades higher latency for better I/O as latency amortized)
bad at lots of small files (too much latency)
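A rough sketch of the bursting rule of thumb above (50 MB/s baseline per TB; burst to 100 MB/s, or 100 MB/s per TB once over 1 TB). The function names are made up for illustration and the numbers are just the ones in these notes, not an official formula:

```python
from typing import Tuple


def efs_bursting_throughput(stored_tb: float) -> Tuple[float, float]:
    """Return (baseline_mb_s, burst_mb_s) for a file system of the given size.

    Uses only the rule of thumb from these notes; real limits may differ.
    """
    baseline = 50.0 * stored_tb
    # At or below 1 TB the burst is a flat 100 MB/s; above that it scales per TB.
    burst = 100.0 if stored_tb <= 1 else 100.0 * stored_tb
    return baseline, burst


def tb_needed_for_burst(target_mb_s: float) -> float:
    """Stored TB needed to burst at target_mb_s (only meaningful above 100 MB/s)."""
    return target_mb_s / 100.0


if __name__ == "__main__":
    print(efs_bursting_throughput(0.5))   # small file system
    print(efs_bursting_throughput(100))   # large file system
    print(tb_needed_for_burst(10_000))    # TB needed for 10 GB/s burst
```

This is why padding a small file system with dummy data raises throughput: consumed space is the only input.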
for on-premise mounting, need Direct Connect (network intensive), or use EFS File Sync Agent
cost = #GB/mo flat rate and more expensive than EBS, e.g. in us-east-1 (N Virginia): $0.3/GB/mo vs $0.1 for gp2, $0.125 for io1 (io1 + 2.7 IOPS still cheaper)
but pay for what you use (vs EBS where you pay for whole volume) but 3x more $ than EBS
2 storage classes: standard and infrequent access (cheaper but has GB transfer costs), can transition with lifecycle policies (very basic: just transition after 14, 30, 60 or 90 days)
uses NFSv4 file system (not all features supported, no SMB), pay for storage you use, supports petabytes, can support concurrent NFS connections
how? i) create EFS, ii) create mount target within AZ (one target per AZ) and subnet (control security via security groups), iii) then mount to EC2 using FQDN (IP resolves to mount target in AZ, will failover to another AZ if AZ dies but EC2 data costs incurred)
security: i) IAM for create/delete/describe, ii) security groups for EC2 access, iii) unix-level user/group permissions (based on numeric ID so need to be careful user numeric IDs are same across EC2)
used for: big data (roman census model stuff, unstructured), web server (PHP server), content management, lift-and-shift migration to AWS
limits = designed to scale, PB-scale storage? 10 efs per region, 1 to 1000s concurrent EC2 instances, can’t boot EC2?
vs AWS FSx for Windows File Server: FSx is NTFS file system, Windows-native (EFS doesn’t support Windows since EFS agent is Linux only), integrate directly with AD (vs EFS which goes via IAM roles to AD)
can use “robocopy” command (robust copy - handles network issues, can mirror directories, can preserve NTFS ACLS)
vs AWS FSx Lustre for Linux: Lustre designed for parallel compute-intensive, short-lived workloads, objects transparently written to S3
Block-Store Performance shootout
(columns: EBS (io1) | EC2 instance store | EFS)
Max Size: 16 TiB | Up to 60 TB (8 x 7500 GB NVMe SSD) | Petabyte-scale
MB/s: 1.7 GB/s for EBS-optimized EC2 instance ceiling (e.g. c5 family) | Possibly 16 GB/s on i3en (this is either EBS network or all 8 NVMe SSDs driving) | 10+ GB/s. Linearly burst 100 MB/s per TB, so 10 GB/s = 100 TB.
Latency: Low | Lowest | Average
Storage Asides
SAN vs NAS
SAN = Storage Area Network: block storage, typically attached to one node, fibre connected
NAS = Network Attached Storage: typically file system (NFS, Samba), connected to multiple nodes. Ethernet connected. Note EBS is physically NAS but closer to SAN
Network
VPC Virtual Private Cloud
virtual data center within your account, can be different regions, complete control of IP address ranges, subnets (e.g. public one for web servers, private one for backend), route tables between subnets, network gateways
is specific to region, but can span AZs, cost money to transfer data between AZs
Virtual Private Network (VPN) - connection between corporate datacenter and VPC so AWS extension of corporate datacenter
Hardware Virtual Private Network (HVPN) - AWS side is virtual private gateway with 2 endpoints for HA (at VPC level, not subnet). If multiple remote networks (e.g. multiple branches) use AWS VPN CloudHub
Software VPN - needs EC2 instance (so single point of failure)
Direct Connect - dedicated private connection, can have VPN over the top too
get default VPC immediately, has internet gateway attached (i.e. internet accessible), every ec2 instance has private+public IP, if deleted must contact AWS to restore
limits: only 5 vpcs per region (1 internet gateway and egress-only gateway per region - tied directly to VPC limit), 200 subnets per VPC, 5 IPv4 CIDR blocks per VPC, 1 IPv6 CIDR block per VPC, 50 customer gateways per region
multicast not supported
can’t really change CIDR blocks after creation. For VPC, can add/remove additional CIDR association (primary can’t be changed or disassociated). For subnet, can’t change.
Routing Tables - virtual router (decides where to route traffic) at subnet level
main routing table (default created by VPC can’t be deleted)
subnets implicitly linked to the default route table if not explicitly defined, changes take effect immediately
setup:
associated with subnet
destination is CIDR block of IPs to route mapped to a “target”
“local” = this VPC. CIDR for VPC will route to “local”. Can’t delete this route, auto-created for you
name of internet gateway, NAT, ENI, EC2 instance, vpc peering connection, virtual private gateway,
destination could also be prefix list for VPC PrivateLink Gateway Endpoints targeting peering endpoint
routing chosen is the most specific (e.g. 172.0.0.0/24 overrides 0.0.0.0/0)
route propagation allows virtual private gateway to propagate routes (so don’t need manual routes for VPN connections). Static routes higher priority over propagated routes. Can be disabled.
or use VPC Wizard. Pre-baked environments i) public subnet with IG, ii) public and private subnet with NAT gateway (main route table in private, follows least privilege), iii) VPN setups.
delete VPC: EC2 instances and VPC peering connections need to be terminated. Seems to detach VPN-related resources (customer gateway connections, virtual private gateway) so they can be reused.
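A minimal sketch of the "most specific route wins" rule from the routing table notes, using Python's standard ipaddress module. The route targets are hypothetical names, and this ignores the static-vs-propagated priority rule:

```python
import ipaddress


def pick_route(routes, dest_ip):
    """Return the target of the most specific (longest prefix) matching route.

    routes: dict mapping destination CIDR -> target name.
    Returns None if nothing matches.
    """
    ip = ipaddress.ip_address(dest_ip)
    matches = []
    for cidr, target in routes.items():
        net = ipaddress.ip_network(cidr)
        if ip in net:
            matches.append((net.prefixlen, target))
    if not matches:
        return None
    # Longest prefix = most specific route.
    return max(matches, key=lambda m: m[0])[1]


# Hypothetical route table: local VPC range, a peering route, and a default route.
ROUTES = {
    "10.0.0.0/16": "local",
    "172.0.0.0/24": "pcx-hypothetical",
    "0.0.0.0/0": "igw-hypothetical",
}

if __name__ == "__main__":
    for dest in ("10.0.1.4", "172.0.0.5", "8.8.8.8"):
        print(dest, "->", pick_route(ROUTES, dest))
```

Here 172.0.0.5 matches both the /24 and 0.0.0.0/0, and the /24 wins because it is more specific.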
related topic: Border Gateway Protocol (BGP) internet routing protocol. Routers propagate info about network (availability/weighting) in case of internet path failure.
why? allows dynamic routing which is robust to failure. Alternative is static routes (AWS doesn’t support anything else)
e.g. if have Direct Connect and VPN to AWS, weight Direct Connect higher (highest weight wins). VPN used as a backup
required for Direct Connect. Optional for VPN
when customer gateway and virtual gateway connect (AWS only supports single hop even though BGP allows multi-hop), they become BGP neighbours (external BGP connection) and exchange route info.
AWS supports BGP community tagging (controls traffic and route preference - how far to propagate preferences; achieve load balancing)
setup: runs on TCP port 179 (and ephemeral ports)
Autonomous System Number (ASN) - unique ID number for an Autonomous System (AS - set of devices under one routing policy and single administration), in BGP network
creating it
vpc creation: name tag; CIDR block (range of IPs); tenancy (dedicated is own hardware (also means instances under it are dedicated too), expensive).
usually use private range from RFC 1918, if use public, can’t route traffic to internet (need NAT)
10.0.0.0 - 10.255.255.255 (10/8 prefix)
172.16.0.0 - 172.31.255.255 (172.16/12 prefix)
192.168.0.0 - 192.168.255.255 (192.168/16 prefix)
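A quick way to sanity check whether a planned VPC CIDR sits inside one of the RFC 1918 private ranges listed above. This is only a sketch using the standard ipaddress module; the helper name is made up:

```python
import ipaddress

# The three private ranges from RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(cidr: str) -> bool:
    """True if the whole IPv4 CIDR block falls inside one RFC 1918 range."""
    net = ipaddress.ip_network(cidr)
    return any(net.subnet_of(block) for block in RFC1918_BLOCKS)


if __name__ == "__main__":
    for cidr in ("10.0.0.0/16", "172.31.0.0/16", "172.32.0.0/16", "192.168.8.16/28"):
        print(cidr, is_rfc1918(cidr))
```

Note 172.32.0.0/16 is just outside the 172.16/12 range, which is an easy mistake to make.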
subnets creation: name tag;
select the VPC
select AZ (subnet can only be in one AZ)
CIDR block - specify range of IPs in subnet (must be within VPC’s range). After creation will show number of assignable IPs (AWS reserves a few so less than theoretical). Can’t change subnet CIDR block after creation
internet gateway creation
name tag
by default not attached, have to attach to a VPC. VPC can only have one internet gateway
egress-only gateway is for ipv6, prevents inbound traffic as ipv6 is globally unique so public by default
elastic IP - public IP which can be (dis)associated with EC2 instances
small charge ($0.005 per hour) if allocated (so no one else can use it) but detached or connected to non-running EC2 instance
limit of 5 elastic IPs per account per region (note EC2 Classic has a separate limit of 5)
when EC2 instance stops, in EC2-VPC (but not EC2-Classic), elastic IP is still associated, may need to disassociate from EC2 console (not VPC console)
Security Groups - attach to EC2 instance, RDS, workspace. Instances can have multiple security groups; rules are allow-only
controls TCP, UDP, ICMP or custom protocol (presumably codes from here implied by REST API although this option not in CLI)
secure by default: if you create it, it will have no rules which means no inbound traffic allowed (but does have a rule to allow outbound)
source/destinations can be CIDR block, another security group (talk to peers in SG), or prefix list (for outbound)
rule evaluation: all rules evaluated
NACL network access control list - evaluates rules in order and the first matching rule wins (lowest # first, multiples of 100 good to insert rules in between), overrides security groups, default created allows all, but new ones are deny all
can be associated with multiple subnets, but subnet can only have one ACL
vs security groups:
security group has allow rules only and is on instance level (NACL at subnet level), stateful (return traffic always allowed), order not important (everything evaluated).
NACL at subnet level and can also deny, stateless (return traffic must be explicitly allowed), order important (evaluated lowest # first to 32K), is auto-applied to instances in subnet (in case someone forgets to add a restrictive security group)
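A toy model of the evaluation difference described above: NACL stops at the first matching rule (lowest number first), security group allows if any rule matches. Heavily simplified, it only looks at a port, not protocol or CIDR, and the rule shapes are invented for illustration:

```python
def nacl_allows(rules, port):
    """rules: list of (rule_number, (low_port, high_port), action).

    Rules are checked lowest number first; the first match decides.
    """
    for _number, (low, high), action in sorted(rules, key=lambda r: r[0]):
        if low <= port <= high:
            return action == "allow"
    return False  # nothing matched: implicit deny


def sg_allows(rules, port):
    """rules: list of (low_port, high_port). Allow-only, order irrelevant."""
    return any(low <= port <= high for low, high in rules)


# Same two NACL rules, different numbering, different outcome for port 22.
DENY_FIRST = [(100, (22, 22), "deny"), (200, (0, 65535), "allow")]
ALLOW_FIRST = [(100, (0, 65535), "allow"), (200, (22, 22), "deny")]

if __name__ == "__main__":
    print(nacl_allows(DENY_FIRST, 22), nacl_allows(ALLOW_FIRST, 22))
    print(sg_allows([(443, 443), (80, 80)], 80))
```

This is why wide rules (like the ephemeral port range) need any DENY rules numbered lower than them.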
flow logs - send network interface logs to CloudWatch to monitor why traffic doesn’t reach an instance (e.g. find restricting security policy) or monitor inbound traffic
applies to NIC, subnet or VPC
doesn’t capture everything: calls to AWS DNS, Windows AWS license renewal, 169.254.169.254, DHCP, traffic to VPC’s reserved IPs. Doesn’t capture content, so more routing debugging?
DNS - within vpc ec2 hostname resolves to private IP but publicly resolves to public IP address, no hostnames for IPv6
DHCP Options - pass config to EC2 instance. Immutable after creation. But can create new one and associate VPC with it
domain-name-servers = DNS servers, up to 4 but OS may impose limit
domain-name = completes unqualified hostnames. If using “AmazonProvidedDNS” DNS server will be something like “ap-northeast-1.compute.internal”
ntp-servers = time server, Amazon’s 169.254.169.123
Reserved IP addresses: Not specific IPs, but positions in the subnet range: 0 (network address), +1 (VPC router), +2 (DNS), +3 (reserved for future use by AWS), last (usually for broadcast, which isn’t supported anyway). I.e. you lose 5 IP addresses per subnet.
E.g. 10.0.0.0/24: 10.0.0.0, 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.255
E.g. 192.168.8.16/28: 192.168.8.16, 192.168.8.17, 192.168.8.18, 192.168.8.19, 192.168.8.31
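The reserved positions (first four addresses plus the last) can be worked out for any subnet CIDR with the standard ipaddress module. A small sketch, helper names are made up:

```python
import ipaddress


def reserved_addresses(cidr: str):
    """Return the 5 addresses reserved in a subnet: first four and the last."""
    net = ipaddress.ip_network(cidr)
    first = net.network_address
    reserved = [first + offset for offset in range(4)]
    reserved.append(net.broadcast_address)
    return [str(ip) for ip in reserved]


def assignable_count(cidr: str) -> int:
    """Addresses left for your own resources after the 5 reserved ones."""
    return ipaddress.ip_network(cidr).num_addresses - 5


if __name__ == "__main__":
    for cidr in ("10.0.0.0/24", "192.168.8.16/28"):
        print(cidr, reserved_addresses(cidr), assignable_count(cidr))
```

So a /28 (16 addresses) only leaves 11 assignable, and a /24 leaves 251.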
IPv6
addresses IPv4 (32-bit) address exhaustion with 128 bits formatted into 8 x 16-bit groups.
AWS assigns a fixed /56 to VPC (can’t choose range since IPv6 public) and a fixed /64 to subnets (vs IPv4 where you can specify width).
IPv6 addresses are always public (have to use egress-only internet gateway to create effectively private IPv6 subnets)
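To see what the fixed /56 for the VPC and /64 per subnet means in practice, a sketch that splits a /56 into /64s. The prefix below is from the 2001:db8::/32 documentation range, not a real AWS allocation:

```python
import ipaddress


def ipv6_subnets(vpc_cidr: str):
    """Split a VPC-sized /56 block into its /64 subnet blocks."""
    vpc = ipaddress.ip_network(vpc_cidr)
    return list(vpc.subnets(new_prefix=64))


if __name__ == "__main__":
    subnets = ipv6_subnets("2001:db8:1234:1a00::/56")  # example prefix only
    print(len(subnets))
    print(subnets[0], subnets[-1])
```

The 8 bits between /56 and /64 give 256 possible subnet blocks per VPC.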
Ephemeral ports - dynamic, short-lived ports used in IP communication. Different OSes use different ranges (Windows XP: 1025-5000, Linux > 32768, suggested to use above 49152). Chosen by the client. In practice for NACL, need 1024-65535.
security group impacts - security groups are stateful so probably OK
Network ACL - must specify both inbound and outbound so NACL for web server needs to allow out on ephemeral port. Since this is quite wide, DENY rules need to be earlier (lower #). For different OSes could have different NACLs to restrict surface area.
Data transfer costs. Typically:
data in is free
out to internet is most expensive
public IPv4/6 charged at both inbound and outbound
out to other regions cost varies by region but cheaper than internet (as AWS has own backbone connecting regions)
for services that rely on EC2 instances (i.e. placement in a subnet, e.g. EC2, RDS, Redshift, ENI, DAX)
between AZ incur costs inbound and outbound (same/cheaper than out to regions)
within AZ is free
for services that don’t have a subnet (e.g. DynamoDB, S3, Kinesis, SQS) transfer within region is free
using internet costs money
AWS Global Accelerator - static IP address for your app (NLB, ALB, EC2 instance, EIP) letting clients connect to closest edge location (same IP but could go to different regions) and traverse AWS backbone for performance/reliability
AWS Firewall Manager - centrally configure firewall across accounts (AWS Organization) and apps
apply centrally managed security groups to sets of EC2 and ENI resources. Optionally automatically remediate non-compliant config
helps with compliance: dashboard of compliance status, SNS notification on compliance breach, creates AWS Config Rules (check WAF not disabled), find unused/redundant security policies
Tips
don’t use VPN, NAT or IG as these can be single points of failure
re:Invent 2017 “another billion flows”
“flow”: 5-tuple of protocol, source IP/port, dest IP/port. Stateful, remembers SEQ and ACK numbers. UDP doesn’t have SEQ/ACK but has Datagram ID
NAT and NLB same - have multiple instances/IPs behind single IP, just in different directions (which allows NLB to present correct source IP if target is instance)
Internet Access Options
For EC2 instance to be internet accessible, being inside public subnet (subnet with route to internet, e.g. IG) not enough. Needs public IP or Elastic IP, routing table (to send 0.0.0.0/0 traffic to IG)
Internet Gateway - VPC component allowing access to the internet. Doesn’t have IP itself. Supports IPv4/6.
why? route table target for internet-bound traffic, does NAT for EC2 instances with public IP (not private)
Egress-Only Internet Gateway - outbound to internet only.
why? for IPv6 which are public. Using EOIG creates private IPv6 subnet. Don’t need NAT
NAT instance (network address translation) - is an EC2 instance so still uses IG
why? allowing DB server (on private subnet) to access internet for yum update. It’s the IP when your backend makes calls out
setup:
create NAT EC2 instance, community “NAT” AMI (t2.micro could be a bottleneck), in public subnet, disable source/destination check (the check enforces that the EC2 instance is either source or destination of traffic, which doesn’t apply to a NAT)
new security group for NAT instance with: outbound all traffic; inbound on 80/443 from CIDR block of subnet containing DB server
setup private subnet’s route table to route internet-bound traffic to NAT
on delete, probably need to delete the EC2 instance, or at least the network interface first, then it can be released (i.e. deleted)
vs bastion (jump box) which is highly secure vs making all EC2 instances highly secure. NAT instance could be a bastion
NAT gateway - managed NAT service. Cost per hour and data transfer. Will be cheaper than NAT instance (it’s optimised)
setup:
assign NAT gateway to public subnet
AZs can share NAT gateway but one NAT gateway per AZ better in case AZ dies, one NAT gateway per subnet for high throughput. Which means different routes per AZ
associate Elastic IP (can’t be changed/disassociated)
update routing table for private subnet to send internet traffic to NAT gateway, and also NAT gateway to IG.
can’t be used to access VPC peering, VPN or direct connect
vs NAT instance: NAT gateway is redundant, scalable (up to 45 Gbps), managed (recommended)
but: can’t detach elastic IP, can’t assign security group (use NACL), can’t be bastion server, manually setup to support port forwarding (forward internet traffic on specific port to private subnet)
vs Egress-Only Internet Gateway: EOIG supports IPv6 but NAT gateway doesn’t
AWS Managed VPN - managed IPSec VPN connection, over internet (“AWS Site-to-Site VPN”)
Why? quick/simple if you have VPN hardware on your side already, highly available on AWS side of VPN
Uses: VPN to VPC, redundancy for Direct Connect or VPC VPN
Bad: uses internet connection, customer-side VPN not HA, no IPv6
Setup: i) designate customer appliance to be “customer gateway” (e.g. router), ii) create “virtual private gateway” and VPN on AWS (which can generate config, e.g. for Cisco), iii) configure customer gateway with config, iv) generate traffic on customer side to bring up tunnel, v) optionally configure BGP routing (device must terminate both BGP and IPSec)
make customer redundant by having two customer gateways pointing to same AWS virtual private gateway
Which subnet for VPN? Wizard creates a VPN-only subnet which is a private subnet. Routes for local VPC and default route goes via virtual private gateway (so that all internet/on-prem traffic goes via corporate firewall/policy). Could change to use NAT for internet to avoid backhauling.
AWS Direct Connect - dedicated network connection over private line (no ISP) straight to AWS’s network
why? Need high bandwidth (up to 10 Gbps), increased reliability/consistency, lower latency, most flexible routing (device must support BGP), could be cheaper (e.g. avoid AWS costs out to internet), can connect multiple VPCs
connect company’s data center to AWS, possibly dedicated co-location.
cons: Additional telecom/hosting provider relationships, network circuit (need hardware near AWS), long time to setup
still single point of failure, ideally have another Direct Connect with another provider
flavours:
Dedicated Connection - physical Ethernet port dedicated to you, 1-10Gbps supporting up to 50 public/private virtual interfaces, or 1 transit gateway. Offered from AWS only
Hosted Connection - offered via approved AWS partner, 50Mbps to 10Gbps. Each hosted connection <500Mbps supports 1 only of public/private/transit gateway. AWS ensures not oversubscribed.
Hosted Virtual Interface (Hosted VIF) - offered via approved AWS partner, VIF used by you but on top of partner’s Dedicated Connection AWS account. AWS can’t ensure not oversubscribed so not recommended.
how: uses VLAN trunking 802.1Q. Partitioned into multiple Virtual Interfaces (VIF) to connect to:
private VIF: connect to AWS services via private endpoints, e.g. VPC, private EC2 IPs, Storage Gateway
public VIF: using public endpoints, e.g. S3, DynamoDB, CloudFront (i.e. using public IP)
can use link aggregation group (LAG) to aggregate multiple connections into one large connection
Direct Connect Gateway: traditionally Direct Connect only connects one region (VIF in one location) and multi-region more complex. Direct Connect Gateway sits between Direct Connect location and VPCs and connects multiple VPCs and regions minimising
Direct Connect plus VPN - IPSec VPN connection over private lines
why? Adds encryption/security over Direct Connect (end-to-end). Encryption prevents provider, employees snooping
AWS VPN CloudHub - connect locations in hub (virtual private gateway) and spoke manner using AWS virtual private gateway (VPN)
like DIY Multiprotocol Label Switching (MPLS) network over IPSec VPN and internet.
spokes connect to each other via AWS
why? connect remote offices to AWS for: backup, to other offices, or even primary WAN access
good: reuses internet connection so fast to setup, supports BGP (e.g. use MPLS first then CloudHub VPN as backup)
bad: uses internet (unreliable), no inbuilt redundancy, max 1 VPC.
but could combine with Direct Connect or multiple redundant VPN options
how? On AWS side: Virtual Private Gateway associated with or without VPC. Create site-to-site VPNs to remote offices (need redundancy), each advertising its BGP routes
restrictions: can be used without VPC, each spoke must not have overlapping CIDR ranges. Each spoke has unique BGP ASN (autonomous system number) and advertises BGP prefixes (routes) to each BGP peer
Software VPN - you provide VPN endpoint and software, install it onto EC2 (e.g. AMI, software options in Marketplace) and manage both sides
good: ultimate flexibility (e.g. OpenVPN not supported by AWS), for compliance reasons
cons: you need to ensure redundancy at all points
Transit VPC - (not to be confused with “Transit Gateway”) connect geographically diverse VPCs to create common transit/hub area (VPC), is cross-account, cross-region (have transit VPC in each with on-prem connecting to both)
what: software VPN with a hub VPC.
why? support global network of different cloud providers, hybrid networks (flexible/control, NAT for conflicting IP ranges, packet filtering/inspection, shared connectivity for cross AWS account usage) but also AWS managed parts. Hub minimises number of VPN connections required (avoid nCr number of connections)
cons: you design redundancy, for cross-account needs Direct Connect Gateway in each account.
how: Cisco/Juniper Networks/Riverbed have products which work with AWS VPC
uses BGP with route propagation
central transit VPC controls what routes advertised
e.g. use transit for internet, then make it advertise default route to IG, which propagates to spoke VPC as default route to VGW
detached virtual gateway, e.g. customer network to Direct Connect to VGW that isn’t in a VPC (i.e. detached), then to transit gateway. Why? Makes customer network look like another spoke to transit VPC to standardise routing/automation
but traffic unencrypted after terminated in VGW
automating adding spokes: CloudFormation tags VGW, Lambda watching for tags and updates VPN config in S3. Another Lambda watching S3 changes will distribute to transit VPC to update. Can extend for cross-account. See here.
limits: route table limits at 100 VPCs, so need another transit VPC
vs VPC peering: can’t do transitive routing, traffic has to terminate/originate on a network device in VPC
e.g. above example of using transit VPC for internet:
in VPC peering, default route to peering connection fails at transit VPC (not source/destination at this point)
fix with transit VPC with VPN, transit VPC terminates VPN tunnel so is source within VPC
vs AWS VPN CloudHub: Transit VPC to connect multiple VPCs
vs Transit Gateway - Transit Gateway is newer, managed version of transit VPC. Can connect to VPC, VPN or Direct Connect Gateway that are within same region. Cross-account.
Transit VPC - for cross-region.
AWS Client VPN - for clients/users to connect using OpenVPN. From on-premise or anywhere on internet. You setup VPN endpoint for each subnet (recommend two public subnets (need IG route) for HA) to expose and clients connect to that
security: uses TLS, not IPSec. Authentication via AWS Directory Service only (which can integrate with on-prem AD and MFA). Inherits security group of subnet
pricing: per subnet that has VPN endpoint association per hour, per client connected to endpoint per hour.
gotchas: not HIPAA/FIPS compliant
vs Site-to-Site VPN connection: client for users into AWS, site-to-site for connecting AWS to office/on-prem
Multiprotocol Label Switching (MPLS): routing technique based on label instead of inspecting the packet and doing complex routing table lookup (OSI layer 2.5, labels are protocol agnostic, scalable)
Gotchas:
MTU (max transmission unit) is max size of packet. Typically 1500 for Ethernet and EC2. Within VPC possible to support jumbo frames (9001 MTU) but only 1500 for Internet Gateway, VPC Peering. Will cause fragmentation.
Connecting VPC To VPC
Similar to Organisation to AWS VPC section above (except whitepaper has no Direct Connect plus VPN, Transit VPC, and CloudHub):
why not just VPC peering? Inter-region only available late 2017
software VPN - cross-region and you want to manage everything
software to AWS managed VPN - cross-region and VPN with some HA
managed VPN - connect multiple VPCs and customer site
Direct Connect - if you’re already Direct Connect and need cross-region, or perhaps cross-account in same datacenter.
VPC Peering - AWS managed network connectivity between two VPCs using direct network route and private IPs (not internet, behave as if in the same network)
why? multiple VPCs that need to talk to each other (even in different AWS accounts, e.g. PROD vs DEV in different accounts)
without peering: EC2 instances within region uses AWS backbone; EC2 between regions not guaranteed to use backbone
limits = Only star configuration (central VPC with 4 peers), not transitive, can’t have overlapping CIDR block range, max 50 active peers (25 outstanding VPC peering connection requests)
“inter-region” VPC connects VPCs from different regions (doesn’t use internet, only AWS backbone)
how: one VPC makes peering request, accepter has 1 week to accept (will auto fail if CIDR overlaps)
likely need security group changes, NACL update, route tables so traffic of peer VPC’s IP goes to that VPC. May need to enable DNS if either VPC addresses by public DNS hostname
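Since a peering request auto-fails on overlapping CIDRs, it can be worth checking candidate VPC ranges up front. A sketch with the standard ipaddress module; the VPC names and ranges are hypothetical:

```python
import ipaddress
from itertools import combinations


def cidrs_overlap(a: str, b: str) -> bool:
    """True if the two CIDR blocks share any address."""
    return ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b))


def find_conflicts(vpcs):
    """vpcs: dict of name -> CIDR. Return sorted name pairs that cannot be peered."""
    return sorted(
        (x, y)
        for x, y in combinations(sorted(vpcs), 2)
        if cidrs_overlap(vpcs[x], vpcs[y])
    )


# Hypothetical VPCs: dev sits inside prod's range, shared does not.
VPCS = {"prod": "10.0.0.0/16", "dev": "10.0.1.0/24", "shared": "10.1.0.0/16"}

if __name__ == "__main__":
    print(find_conflicts(VPCS))
```

The same check applies to CloudHub spokes, which also must not overlap.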
AWS PrivateLink - AWS provided connectivity between VPCs and/or AWS services use interface “endpoints” (VPC endpoints)
why? keeps private subnet truly private by using AWS backbone to access services (i.e. more granular than VPC peering) instead of internet (normally need NAT/IG for private subnet to access AWS services). Reliable since using AWS backbone. If each microservice in own VPC, minimises blast radius.
e.g. API provider on Marketplace provides services, can connect to it without internet (i.e. consume surface in other VPC)
piggy-backs inter-region VPC peering for services in other regions
how? create endpoint for AWS/Marketplace service in all needed subnets
flavours of endpoints:
interface endpoint: ENI (security groups here) with private IP (ideally deploy to multiple AZs for HA), using DNS name (to route to private path). Used by API Gateway, CloudFormation, CloudWatch, service in marketplace, etc. Secured by security group
gateway endpoint: is target for a specific route, uses prefix list in route table to redirect (like intercept in route table that has prebaked config). Used by S3 and DynamoDB only. Secured by PrivateLink (VPC endpoint) policy (like IAM policy syntax, e.g. only allow it to be accessed via endpoint). Could be used for cost reduction as no per-hour charge like interface endpoint (could save on inter-region transfer)
S3 gateway endpoint - continue to use AWS DNS names but changes to private IP (inflight connections might break). Could restrict to specific buckets here. Gotchas: doesn’t support cross-region requests, can’t use aws:SourceIp in IAM/bucket policy (two VPCs could have same private IP range; instead use route policies in VPC to control what uses endpoint or bucket policy to allow by VPC ID)
DynamoDB gateway endpoint - continue to use AWS DNS names. Could restrict to specific tables or read-only in endpoint policy. Gotchas: no cross-region requests, no DynamoDB streams access
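A sketch of the "bucket policy to allow by VPC ID" idea for the S3 gateway endpoint, built as a plain Python dict and dumped to JSON. It uses the aws:SourceVpc condition key; the bucket name and VPC ID are placeholders. Careful: a blanket deny like this also blocks console access from outside the VPC, so treat it as an illustration rather than something to paste in:

```python
import json


def bucket_policy_for_vpc(bucket: str, vpc_id: str) -> dict:
    """Deny all S3 actions on the bucket unless the request came via the given VPC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromVpc",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # Matches on VPC ID rather than source IP, since private
                # IP ranges can be identical across VPCs.
                "Condition": {"StringNotEquals": {"aws:SourceVpc": vpc_id}},
            }
        ],
    }


if __name__ == "__main__":
    policy = bucket_policy_for_vpc("example-bucket", "vpc-0example")
    print(json.dumps(policy, indent=2))
```

To pin to one specific endpoint instead of a whole VPC, the aws:SourceVpce key takes the endpoint ID in the same position.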
VPC endpoint service - power own application with PrivateLink. Put your app behind NLB. Need multiple AZs for HA
VPC endpoint policy - resource IAM policy controls who can access the endpoint.
gotchas: within region only. AZs in one account might not be same as your account.
re:Invent: Connecting Many VPCs
fewer larger VPCs (less admin, tighter account control via IAM, SG, routing, allocating billing/ownership tricky, big blast radius). Good for policy/IAM focus
vs many small VPCs (more account/infrastructure complexity, tighter provisioning standardisation/automation, easier billing, smaller radius except for shared components). Good for infrastructure/networking focus
bad: lots of connections (VPN could be slow to provision, e.g. 30-day turnaround), nCr relationship
problems: 5-15 VPCs need automation, 50-100 VPCs have VIF port limit and route limits, 125 max VPC peer limit, 200+ need internet (e.g. PrivateLink, bastion, certs/keys over passwords)
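The nCr point is just n choose 2: connecting every VPC to every other VPC grows quadratically, while a hub (transit VPC) grows linearly. A tiny sketch, function names are made up:

```python
def full_mesh_connections(n_vpcs: int) -> int:
    """Connections needed so every VPC links directly to every other (n choose 2)."""
    return n_vpcs * (n_vpcs - 1) // 2


def hub_and_spoke_connections(n_spokes: int) -> int:
    """Connections needed when each spoke VPC links only to one central hub."""
    return n_spokes


if __name__ == "__main__":
    for n in (5, 15, 50, 100):
        print(n, full_mesh_connections(n), hub_and_spoke_connections(n))
```

At 100 VPCs a full mesh is 4950 connections versus 100 spokes into a hub, which is where the peering and route limits start to bite.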
Firewalls: Why? If you have terrible processes/operations, you’re just moving those to AWS and losing benefit. Probably doesn’t scale. Perhaps compliance reasons
Shared services: e.g. for AD, DevOps tooling.
typically: uses VPC peering but: CIDR overlap issue, full VPC connectivity blunt (overkill), hard limit of 125 peers
instead use PrivateLink, service advertises API with IP, but appear as local IP in consumer VPC. Unidirectional (consumer makes requests so more secure)
could put inside transit VPC but simpler routing if just another VPC
inter-region connectivity: i) VPC peering (1:1 VPC peering), ii) Direct Connect Gateway (high performance, must be provisioned), iii) cross-region VPN with transit VPC (full control, complex management)
Route 53
DNS (domain name system) which runs on port 53. Not region-specific (i.e. global)
record set - tells DNS how to route, e.g. to IP, route email to mail server, route subdomain to a different host, can apply routing policy (below)
top level .com, .au; 2nd-level the “.com” in “.com.au”. See more
domain registrar is an authority that can assign domains under top-level domain(s) with InterNIC (a service of ICANN)
TTL (time to live) cache time in seconds on either resolving server or user’s PC
DNS record types
NS (name server) records are used by top-level domain servers to direct queries to the content (authoritative) DNS servers
A record translates name to IPv4 (AAAA for IPv6), i.e. which IP does https://acloud.guru/ go to?
mapping IP address can have failover implications. Need to either i) use DNS provider that has CNAME-like functionality at apex (Route 53’s alias record), or ii) subdomain redirection
CNAME (canonical name) points one domain to another (i.e. alias), e.g. m.acloud.guru and mobile.acloud.guru to the same place. Maps to another domain, not an IP (e.g. stubhubz.andrewcho.xzy alias of d331p134cpyhks.cloudfront.net)
Can’t be used for “naked”/“root”/“bare” record (zone apex record), i.e. without www http://acloud.guru, use A record or alias instead
“Alias” record (Route 53-specific) map resource record sets in your hosted zone to AWS resource (ELB, CloudFront, S3 static web site, Beanstalk, VPC interface endpoint) similar to CNAME, but alias record generally favoured
issue with A record needs IPv4, but can’t get it for ELB (it changes), hence alias record. Route 53 automatically will change it when record set (ELB) changes
doesn’t cost money if , allows naked domain mapping to ELB
DNSKEY record - not supported by Route53. Cryptographically check whether DNS record came from correct DNS server.
MX for mail exchange
SOA (start of authority) record, who is admin, version info, TTL info
TXT record - many purposes, e.g. SPF (sender policy framework) to prevent email forgery by specifying which IPs can send mail for the domain
AAAA IPv6 host address
SRV service record - defines hostname and port of a service. Used by DNS Service Discovery, Kerberos, LDAP, Puppet.
hosted zone is public (internet accessible) or private (responds to VPC only for private routing within VPC)
private vpc also allows 100 VPCs from different accounts to be under the same private hosted zone, can use it to block access to a DNS name
routing policies, how Route 53 responds to queries:
flavours:
simple - default, for single resource for a given function on domain (round robin?)
weighted - multiple resources that do the same function, weight (0 disable to 255. Relative). E.g. define 10% traffic to west-1, 90% to west-2. Could be used for A/B testing, blue-green. Not immediate, redirects over the course of the day as DNS could be cached
lab notes: create two record sets with the same A record, both an alias to different ELBs (e.g. London vs Sydney ELB)
mixing routing policies: want this one over instances to implement A/B or blue-green
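Weighted routing weights are relative, so each record's expected share is its weight over the sum of all weights. A sketch of that arithmetic only (record names are placeholders, and the all-zero case is deliberately not modelled):

```python
def traffic_share(weights):
    """weights: dict of record name -> weight (0 to 255).

    Returns each record's expected fraction of responses. A weight of 0
    means the record gets no traffic while others have non-zero weight.
    """
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all weights are zero; this sketch does not model that case")
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    print(traffic_share({"west-1": 10, "west-2": 90}))
    print(traffic_share({"blue": 255, "green": 0}))
```

For a blue-green cutover you would nudge the weights over time; because of DNS caching (TTL) the real shift lags the numbers.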
latency - based on lowest latency for end user. Have to tell it what region your resource is in. “Set ID” differentiates records with the same name and type (e.g. My London Datacenter). Costs extra
vs cloudfront - cloudfront is a cache (doesn’t help if content not cacheable) but Route 53 is compute resources
failover - active/passive setup (e.g. disaster recovery), define primary (active) and secondary (passive failover) records
geolocation - route based on geographic location of user (continent, country, US state). E.g. localisation/data sovereignty/enforce distribution rights
costs extra money, if overlapping rules smallest area wins, need default in case can’t determine location of user
geoproximity - route based on geographic distance between resources and users. Set “bias” (-99 to 99) on regions to make their service area bigger/smaller
requires “traffic flow” (costs extra)
vs latency policy: geoproximity supports non-AWS resources (set latitude-longitude), force shift of traffic to a region (e.g. make us-west-2 higher priority because it’s cheaper even though us-west-1 closer)
multivalue answer = return multiple IPs which can help (but not replace) high availability and load balancing. Clients should be smart to use next IP if one is bad
responds with up to 8 random healthy records. If all unhealthy, respond with 8 unhealthy records
vs ELB: ELB has sticky session, doesn’t suffer from DNS cache issues
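The multivalue answer behaviour described above can be sketched as follows. This is only a simulation of the selection rule in the notes (up to 8 random healthy records, falling back to unhealthy ones when nothing is healthy); `multivalue_answer` and the record tuples are hypothetical names, not a Route 53 API.

```python
import random

MAX_ANSWERS = 8  # per the notes: at most eight records per response


def multivalue_answer(records, rng=random):
    """records: list of (ip, healthy) tuples. Returns up to eight IPs."""
    healthy = [ip for ip, ok in records if ok]
    # Fall back to the unhealthy records only when nothing is healthy.
    pool = healthy if healthy else [ip for ip, _ in records]
    return rng.sample(pool, min(MAX_ANSWERS, len(pool)))
```

A client receiving this answer should still try the next IP when one fails, since the health view may lag behind reality.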
health checks: not against DNS record but against individual servers. Healthy if more than 18% of checkers report healthy (many checkers across the globe).
all records need to have same name (e.g. example.com, primary.example.com), type (A, CNAME, alias), and routing policy
can also check against another health check or CloudWatch alarm
if health check is unhealthy, Route 53 won’t respond with that record’s IP
costs money
traffic flow - costs $50 per policy record per month. Visual editor to create traffic policy. Can mix policies together
transition examples:
using 3rd party domain registrar with Route 53 - replicate DNS records in Route 53, update domain registrar NS entries to point to Route 53
map subdomain to Route 53 without migrating parent - add Route 53’s NS servers for the subdomain as NS records in the parent’s zone
transition to latency based routing - i) set up new copy-www.domain.com using same type (A) and value (W.W.W.W) as original, ii) update existing www.domain.com A record to be weighted alias record with target copy-www.domain.com of weight 100, iii) setup latency records, each: same name (www-lbr.domain.com) and type (A). Test this. iv) create another alias of www.domain.com with alias target www-lbr.domain.com and weight 1, v) test and ramp up weight to shift to www-lbr.domain.com
public (internet) vs private (DNS for internal, within VPC) hosted domains
limits = max 50 domains you can manage, max 500 hosted zones and 10,000 resource record sets per hosted zone, s3 bucket must match record set name (see doc)
AWS CloudFront
Content delivery service (CDS) in a content delivery network (CDN) to distribute web pages/video/APIs to users based on user location, origin of page, and content.
concepts
edge location = location where content is cached, not a region or availability zone
origin = where CloudFront will get files from (S3 bucket, EC2, ELB, Route 53, MediaStore, or non-AWS server, but not IP), content can be static, dynamic, streaming, interactive
distribution = name of CDN consisting of collection of edge locations, users hit edge location first, if cache miss hit origin, caches have TTL
web distribution = used for web sites (HTTP(S)), also supports media streaming
RTMP distribution (real-time messaging protocol) = used for streaming Adobe Media Server. CloudFront needs media + player (JW Player, Adobe Flash)
video on demand with AWS Elemental MediaConvert (e.g. transcode video from S3 bucket)
live streaming video with AWS Media Services (AWS Elemental MediaLive encodes video into different formats for viewers; or AWS Elemental MediaStore as origin)
TTL of “0” will still use CloudFront for caching, it just asks Origin each time. Good for DDoS protection, moving edge location closer to user
cost = #GB in (scales down after 1st 10GB/mo), #GB out flat rate, #requests (per 10K requests)
can be apex record (“A” which normally requires an IP) via Route 53’s alias record
can gzip compress content
can serve private content using a few methods:
AWS assumes pattern of: access to origin only via CloudFront (configure origin to block non-CloudFront access)
for S3 buckets (not websites), can use Origin Access Identity (OAI) described in S3 section. Can be used with signed URLs
signed URLs or cookies - application sends the viewer a URL/cookie signed with the private key of a private/public key pair. CloudFront verifies it with the public key.
setup:
specify AWS accounts that can sign URLs/cookies (trusted signers) in distribution
application verifies user and distributes signed URL (or 3 cookies) with either canned/custom policy (JSON policy with expiry date of URL; custom allows reuse between files, valid start date, allowed IP of users)
browser invokes signed URL (or URL if cookie method) and CloudFront uses public key to verify, validate policy, forwarding to origin if valid
could have custom error page here
URL or cookie? URL for specific files, RTMP. Cookie for multiple files (e.g. playlist of video files), don’t want URLs to change. CloudFront prefers signed URL if both sent
URL format different from S3. CloudFront allows you to specify the key-pair to use, S3 doesn’t.
custom headers for custom origins. Can be used with signed URLs. Setup: i) configure CloudFront to send “origin custom header” with some magic value, ii) configure security policy to require HTTPS for viewer and origin to use same protocol as viewer (ensures encryption), iii) update origin to only accept requests with magic header.
Eh? If someone else specifies it and bypasses CloudFront, we’re screwed. AWS recommends rotating it.
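For the signed URL/cookie method, a rough sketch of the two policy shapes (canned: expiry only; custom: wildcard resource, start date, allowed IP range). The JSON key names follow the CloudFront policy statement format as best recalled and should be checked against the CloudFront documentation. The function names and URL are hypothetical, and the actual signing step with the private key is deliberately left out.

```python
import json


def canned_policy(url, expires_epoch):
    """Canned policy: a single URL with an expiry time only."""
    statement = {
        "Resource": url,
        "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
    }
    # Compact separators: the policy is signed as an exact string.
    return json.dumps({"Statement": [statement]}, separators=(",", ":"))


def custom_policy(resource, expires_epoch, starts_epoch=None, source_cidr=None):
    """Custom policy: resource may contain wildcards, plus optional
    start date and allowed viewer IP range."""
    condition = {"DateLessThan": {"AWS:EpochTime": expires_epoch}}
    if starts_epoch is not None:
        condition["DateGreaterThan"] = {"AWS:EpochTime": starts_epoch}
    if source_cidr is not None:
        condition["IpAddress"] = {"AWS:SourceIp": source_cidr}
    statement = {"Resource": resource, "Condition": condition}
    return json.dumps({"Statement": [statement]}, separators=(",", ":"))


# Hypothetical distribution domain and path.
print(custom_policy("https://example.cloudfront.net/videos/*",
                    1700003600, starts_epoch=1700000000,
                    source_cidr="203.0.113.0/24"))
```

The custom form is what allows one policy to be reused across many files, which is why it suits the cookie method.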
caching
invalidating cache costs money. Methods: i) file-versioning in name (most reliable in case of client caches, free, allows rollback), ii) wait for cache expiry (free), iii) via console, iv) via CloudFront API, v) 3rd party tool
configure caching by query parameters, cookies (CloudFront caches by name+value), or headers to be none (forward none to origin), whitelist (only forward those that will generate unique results), or all (worst cache hit ratio). Could use Lambda@Edge to normalise (ordering, case).
expiration typically origin’s “Cache-Control max-age”, “Cache-Control s-maxage” or “Expires” header (“Cache-Control” and “Pragma” request headers from the viewer can’t force a request out to origin). Useful for controlling individual files.
or if you want to apply to all files, override it with minimum, maximum and default TTL. Complicated
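To illustrate why normalising query parameters (ordering, case) improves the cache hit ratio, here is a small sketch of the idea in plain Python. It is not a Lambda@Edge handler; `normalise_query` and the parameter names are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode


def normalise_query(query, whitelist):
    """Lower-case parameter names, keep only whitelisted ones, sort them.

    Two requests that differ only in parameter order, name case, or
    irrelevant parameters then map to the same cache key.
    """
    pairs = [(k.lower(), v) for k, v in parse_qsl(query, keep_blank_values=True)]
    kept = sorted((k, v) for k, v in pairs if k in whitelist)
    return urlencode(kept)


print(normalise_query("Size=L&utm_source=x&colour=red", {"size", "colour"}))
print(normalise_query("colour=red&size=L", {"size", "colour"}))
```

Both calls produce the same string, so an edge cache keyed on it would store one object instead of two.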
security
supports SNI (server name indication) - during the TLS handshake the client connects by IP (the handshake happens before the HTTP part), so if there are multiple web sites with different certs on the same IP, the server doesn’t know which cert to return and a common name mismatch may occur. SNI includes the requested hostname in the handshake.
if clients don’t support it, CloudFront will dedicate IP to distribution for $600/mo
certificates
default CloudFront certificate (*.cloudfront.net)
or custom domain SSL cert (e.g. generated by AWS Certificate Manager). For multi-region deployments, would need to deploy cert into each region (makes rotation sucky)
field-level encryption - provide CloudFront with public cert and it can encrypt specified POST parameters for end-to-end encryption. Origin decrypts using AWS Encryption SDK
security policy - specify minimum supported SSL/TLS for viewers: SSLv3, TLSv1.0, TLSv1_2016 (recommended, like TLSv1 but with fewer ciphers), TLSv1.1, TLSv1.2
can add WAF and geo-restrictions
encryption in transit - can enforce HTTPS on viewer separately to origin. CloudFront request to origin involves SSL handshake
failover - origin group (group of origins) CloudFront will try if one fails
create web distribution form
origin id = distinguish multiple origins in a CDN
restrict bucket access = prevent access via s3 URL, must be via CloudFront URL (e.g. URL signed content)
allowed HTTP methods - also a put/post/patch/delete will allow upload to edge location and will sync to origin
restrict viewer access (use signed urls or signed cookies) - e.g. only allow access to identified users
price class - more locations gives better performance, us/Europe + Asia + all
alternate domain names - don’t use ugly AWS domain names, e.g. d2fbkyd1m6nzr1.cloudfront.net
default root object - web server index.html files
logging, can go into another S3 bucket, or another account’s S3 bucket
edit
behaviors - path pattern to redirect HTTP to HTTPS
error pages - custom page on error
geo-restrictions - whitelist/blacklist based on countries, can’t white+black-list
invalidate - without waiting for TTL
limits: default 24 hours TTL, 3K invalidation objects total concurrent
method for routing IP address and IP traffic (layer 3 network traffic), consists of two parts:
most significant bits: network address (or network prefix or network block), which identifies a whole network or subnet,
least significant set forms the host identifier, which specifies a particular interface of a host on that network
CIDR notation, suffix indicates bits in prefix
32 - #bits in prefix = #bits of addresses, e.g. /32 = 1 address, /22 = 1024 addresses
192.168.100.14/24 represents the IPv4 address 192.168.100.14 and its associated routing prefix 192.168.100.0, or equivalently, its subnet mask 255.255.255.0, which has 24 leading 1-bits.
192.168.100.0/22 represents the IPv4 address 192.168.100.0 to 192.168.103.255
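The CIDR arithmetic above can be checked with Python's standard `ipaddress` module. This is a small sketch that reuses the example addresses from the notes.

```python
import ipaddress

# Number of addresses = 2 ** (32 - prefix length)
for prefix in (32, 24, 22):
    print(f"/{prefix}: {2 ** (32 - prefix)} addresses")

# An address with its routing prefix and subnet mask
iface = ipaddress.ip_interface("192.168.100.14/24")
print(iface.network)   # the routing prefix as a network
print(iface.netmask)   # the equivalent dotted subnet mask

# First and last address covered by a /22 block
net = ipaddress.ip_network("192.168.100.0/22")
print(net[0], net[-1], net.num_addresses)
```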
IP routing table
router uses CIDR notation to define where to route traffic, in principle consists of:
destination - usually in CIDR notation to keep tables small, e.g. entry for 192.168.0.0/24 will detail how to route traffic for addresses 192.168.0.0 to 192.168.0.255
target/gateway/next hop - next hop on where to route the traffic
cost/metric - value defining how “expensive” this route is (e.g. localhost over 127.0.0.1 is cheaper than going over NIC)
may also include interface (i.e. which NIC to use)
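A minimal sketch of a routing table lookup, assuming the usual rule that the most specific (longest) matching prefix wins and the metric only breaks ties between equally specific routes. The table entries and next-hop names are hypothetical.

```python
import ipaddress

# Hypothetical routing table: (destination CIDR, next hop, metric).
ROUTES = [
    ("0.0.0.0/0", "internet-gateway", 10),
    ("192.168.0.0/16", "router-a", 5),
    ("192.168.0.0/24", "local", 1),
]


def next_hop(destination_ip, routes=ROUTES):
    """Return the next hop for an IP, or None if no route matches."""
    ip = ipaddress.ip_address(destination_ip)
    candidates = []
    for cidr, hop, metric in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            candidates.append((network.prefixlen, metric, hop))
    if not candidates:
        return None
    # Longest prefix first, then the cheapest metric.
    prefixlen, metric, hop = min(candidates, key=lambda c: (-c[0], c[1]))
    return hop


print(next_hop("192.168.0.7"))
print(next_hop("192.168.9.1"))
print(next_hop("8.8.8.8"))
```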
OSI Model - Open Systems Interconnection model. Mnemonic “please do not throw sausage pizza away”
transport - e.g. TCP (connection-based, stateful, reliable with acknowledgements - used for web/file transfer), UDP (connectionless, stateless, unreliable, no ACKs, no sequencing, packet drops are OK - used for streaming*/DNS)
connectionless - nodes can send without prior arrangement/setup (i.e. sender not even sure recipient ready to receive). Messages usually called “datagrams”. Usually stateless
stateless - no way to know where you are in a conversation
*really use UDP for streaming video? UDP has issues (have to build own application layer ontop of it, could be blocked by firewalls). Likely TCP will be used (especially in browser over HTTP, content isn’t strictly “live” and cached via CDN e.g. YouTube/Netflix). UDP still useful for video conferencing (enterprise app)
network - close to “internet layer”, end to end. IPv4/6, ARP (address resolution protocol - translate internet layer IP to MAC address), IPSec, ICMP (Internet Control Message Protocol, used by network devices to communicate about network health, e.g. ping, traceroute)
data link - close to “link layer”, point to point. MAC (media access control). Address uniquely identifies hardware in a network
physical - copper/fiber cable, wifi radio signal
DDOS (Distributed Denial of Service) attack
attacks
amplification/reflection attack - amplification (means small data in, lots of data out), reflection (spoof packet, e.g. MONLIST command, to send response to victim)
SYN flood, keep half-open TCP server connections open (SYN part of 3-way-handshake)
cache poisoning
mitigations:
scale to absorb attack: auto-scaling groups (or trigger on CloudWatch metrics), using CloudFront (serve content from edge locations closer to user not attacker, even in-front of dynamic content still useful), API gateway (rate limiting, combine with CloudFront) Route 53, S3 static hosting
ALB stops SYN floods and UDP reflection attacks, but NLB will forward everything
safeguard exposed resources: route 53 (restrict geographies that can talk to us), AWS WAF, AWS Shield:
AWS DDOS protection. Standard is free (on Route 53, CloudFront; get free SYN flood protection)
can pay for “advanced” enhanced protection: detect on ELB, EIP, extra CloudWatch metrics, reduced data out cost, access to specialised response team, free WAF - team can recommend WAF rules
learn normal behaviour: AWS GuardDuty (machine learning over CloudTrail, VPC Flow Logs, DNS logs, and threat-intelligence from CrowdStrike/Proofpoint to detect threats and automate responses), CloudWatch
Compute
EC2 - Elastic Compute Cloud
Virtual server, SLA of 99.95% available
pricing models:
on-demand - fixed rate by hour, no commitment, spiky, can’t be interrupted, dev/test environments, supplement reserved capacity
good for: spiky hard-to-predict demand, time period between 6 hours and 1 year term. Supplement reserved capacity (e.g Black Friday sales), or for development/test environments
reserved - significant discount on hourly charge, 1 year or 3 year terms, stable/predictable capacity requirements (only way to guarantee capacity), can mix/match with on-demand. Can change AZ, networking type
good for: steady state predictable usage (e.g. production envs)
Automatically used if you launch an instance matching RI attributes. Can be shared across accounts.
scope reservation to a particular AZ (“zonal” guarantees capacity) or all AZs within a region (“regional” with instance size flexibility but not guaranteed capacity). Can change between zonal and regional.
Different plans:
standard (cheapest, scale up within family if Linux/UNIX, regional RI with default (shared) tenancy - no Windows/Redhat/SUSE)
convertible (can change up as long as greater value, e.g. instance family, OS, tenancy, payment options). Can take advantage of price drops. Can’t be sold on RI marketplace.
scheduled (on a schedule)
pick instance type (e.g. m1.small), platform (Linux), AZ, tenancy (shared vs dedicated), and term (1 or 3 years)
also host tenancy (on hardware you control, e.g. for hardware-based licensing) whereas default is simpler (just client-specific hardware). Can switch from host to dedicated by stopping instance. Switch to default by snapshot, AMI move, and launch
marketplace exists to buy cheaper/shorter terms
on-demand capacity reservation - zonal reservation with no billing discount (regional RIs apply discount but zonal don’t) but don’t need to sign up for minimum 1-year term
spot - bid excess capacity AWS maintains to serve on-demand (prices reflect long-term trends/capacity), for flexible start/end times, will terminate your instances (charged partial hours if you terminate, not charged if AWS terminates)
why? good for big compute jobs (timed for availability zone, day of week, to save 50-94% $) to be cost-effective
Spot Fleet - specify capacity (instance type or vCPUs/memory) and it finds best AZs (if you don’t specify) and spot instances to fulfil it. Launch all at once.
good for: HPC, Hadoop workflow, batch-processing job.
can mix in on-demand instances (does use reserved instance capacity if applicable), i.e. core minimum capacity with spot instances for “application auto-scaling”
vs “EC2 Fleet” - EC2 fleet is newer. Specify capacity (instance type, vCPUs, or application-oriented unit) and AWS will launch it, scale for you (push button fleet deploy). Can mix on-demand, reserved and spot instances.
but: EBS-backed spot instances can’t be stopped (terminate/reboot/interrupt only)
launch group: launch all or nothing (terminates all or nothing too). Decreases chance of launch success, increases chance of interruption (each AZ has its own capacity)
interruption when price exceeds bid price, spot instances sacrificed to fulfil on-demand requests
should get 2-minute notice via CloudWatch event
behaviour: terminate, stop, stop-hibernate (need to kick off agent on instance start, doesn’t get interruption notice as agent reacts immediately) which can resume when price drops
request types: one-time (“fill and kill”), persistent (active until expired/cancelled even if fulfilled, will resume/start new instance when price drops).
durations: interrupted, 1-6 hours (pay premium over spot but generally won’t be interrupted, only if EC2 capacity issues).
need to be flexible with instance type, region/AZ (pricing at AZ-level), to get cheapest price
cost
#hours up,
GB of data: in/out from public/elastic IP (internet free), in from AWS services in other AZ
families (DR MC GIFT PX, AWS doco, shouldn’t need to know this for the exam):
general
T2/T3 - lowest cost, general purpose, web server, small DB, burstable
provide baseline performance per vCPU which is different depending on size which also affects CPU credit accumulation
one CPU credit = 100% utilisation for 1 minute
max credits in 24 hour period = vCPU * baseline percentage * 24 hour period * 60 min, e.g. t3.medium is 2 vCPU at 20% = 2 * 20% * 24 hr * 60 min = 576 max credits
T2 has launch credits to burst (50% for one hour, i.e. 30 credits per vCPU, but only 100 T2 instances launched get this over 24-hour rolling period)
T3 no launch credits but can use “unlimited bursting” by default (so can T2 but not default option). Charge per extra vCPU above (CPU utilisation averaged over rolling 24-hour period so can burst longer than burst credits allowed if rest of 24-hour period quiet)
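The CPU credit arithmetic above can be sketched as follows, using the definition that one credit is one vCPU at 100% for one minute. The helper names are hypothetical; baseline percentages per instance size should be taken from the AWS documentation.

```python
def max_daily_credits(vcpus, baseline_percent):
    """Credits accumulated over 24 hours at the baseline rate."""
    return vcpus * baseline_percent * 24 * 60 / 100


def credits_used(vcpus, utilisation_percent, minutes):
    """Credits consumed running at a given utilisation for some minutes."""
    return vcpus * utilisation_percent * minutes / 100


# t3.medium example from the notes: 2 vCPU at a 20% baseline.
print(max_daily_credits(2, 20))

# T2 launch credits: 50% for one hour on one vCPU.
print(credits_used(1, 50, 60))
```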
G3 - graphics-intensive, can stream workstation, video encoding, machine learning, 3D app streaming, 3D rendering
F1 - field programmable gate arrays (FPGA), custom hardware acceleration for financial analytics, real-time video processing, big data search analysis/security
storage optimised
I3 - high speed storage, GB of SSD, low-latency, high random I/O, NoSQL (Cassandra/MongoDB), data warehousing, scaled out transactional DBs, Elasticsearch
D2 - dense storage, TB of HDD, file server, MPP (massively parallel processing) data warehousing, MapReduce/Hadoop, distributed file system
instance type features
enhanced networking adapter (ENA) - SR-IOV (Single Root I/O Virtualization) different type of virtualization with better latency/bandwidth at lower CPU. Supported on some EC2 instance types. Free. Might require special drivers.
Different flavours (what you get depends on EC2 instance)
Intel 82599 VF Interface at 10 Gbps
Elastic Network Adapter at 100 Gbps
vs Elastic Fabric Adapter (EFA) - special type of ENA for HPC (cluster computing) to accelerate traffic within a subnet by bypassing the OS. Limits: subnet only, seems Linux only.
EBS-optimized instances - instance can fully use its IOPs, dedicated throughput, some types are EBS-optimized by default
instance recovery features - action in response to alarms:
reboot - restart instance (doesn’t migrate the EC2 instance, just restarts it)
creating it
ami - amazon machine image, contains: template for root volume (OS, apps, could be ephemeral), launch permissions (which accounts can use it), block device mapping to attach (i.e. EBS snapshot)
private by default (can be shared), or make it public (but need to harden it https://aws.amazon.com/articles/9001172542712674 and https://aws.amazon.com/articles/0155828273219400)
regional (can only launch from region of image - probably because EBS snapshots are region-specific), but image can be copied to another region
select ami based on region, OS, architecture (32 vs 64bit), launch perms,
storage for root device: instance store (ephemeral) or EBS backed volumes - persistent, provision from EBS snapshot (fast), but can’t delete snapshot until AMI deregistered
subnet - separate resources within VPC, each subnet within one availability zone
termination protection - has to be removed before instance can be terminated from AWS console or API, defaults to no protection
advanced details - user data (script run on launch - this is not start) - debug by looking at /var/log/cloud-init.log files
storage - IOPS scales with provisioned disk (e.g. 8gb at 3 IOPS/gb = 24 IOPS), delete on termination flag (default will delete),
root volume can’t be encrypted on launch (can encrypt with 3rd party, e.g. Windows BitLocker) but can be via an AMI https://aws.amazon.com/blogs/aws/new-encrypted-ebs-boot-volumes/
tags - 10 max, good for billing/tracking
security groups - virtual firewall
sources: IP, CIDR block notation (whitelist source IPs, can’t deny like network ACL), other security group
types (SSH, HTTP, RDP, SFTP) knows TCP/UDP and ports,
all inbound traffic denied by default, can’t create deny rules (only allow), changes takes effect immediately
if you have inbound rule, don’t need outbound rule, rules are stateful (unlike network ACL)
source
key pair - SSH auth with identity file (or for Windows, to get logon password)
if you lose key pair, regain access to EC2 instance by: i) stop instance, attach its root volume to another instance and modify /home/ec2-user/.ssh/authorized_keys, or ii) use SSM “AWSSupport-ResetAccess” document which creates a new AMI
virtualization technologies
HVM (hardware virtual machine) - recommended, run OS directly on VM (no boot loader), hardware emulated (this used to be slow but same/faster than PV now)
PV (paravirtual) - were traditionally faster, boots with special boot-loader, could run on host without PV support (but loses hardware extensions)
stopping
keeps: elastic IP (still associated), private IPs (elastic network interface), load-balancer group (have to remove from pool manually), auto-scaling group (will detect unhealthy and deploy a new one)
lose: instance store volume data, except for reboot (vs stop/start) keeps everything
package manager
SSH user is “ec2-user”
yum
/sbin/service
volumes - virtual hard disk, exist on EBS, to attach to EC2 must be in same availability zone
attaching process:
use AWS console to create volume (needs to be in same availability zone as ec2 instance)
lsblk - to find name of block device, i.e. the volume
file -s /dev/<name of block device> - to see if there’s a file system there, “data” = raw blocks, no file system
mkfs -t ext4 /dev/<name of block device> - format disk to create file system
mount /dev/<name of block device> <directory to mount to> - add the disk to the OS
umount /dev/<name of block device> - unmounts from OS
RAID (see Wikipedia for levels) Redundant Array of Inexpensive/Independent Disks
RAID 0 - striped volume across multiple disks
redundancy: none, 1 disk failure = data loss
performance: read fastest, write fastest
min # disks: 2
capacity: 100%
good for: scientific/gaming (non-essential data)
RAID 1 - mirrored
redundancy: 1 disk failure = OK
performance: read fast, write fast (has to write twice)
min # disks: 2
capacity: 50%
good for: critical fault tolerance
RAID 5 - block-level striping with distributed parity (last drive has bit acting as a checksum)
redundancy: 1 disk failure = OK (can rebuild with checksum on parity)
performance: read fastest, write average (has to calculate/write to parity)
min # disks: 3
capacity: (n - 1)/n
good for: complex, not recommended by AWS (parity writes eat IOPS). If another failure during rebuild, screwed.
RAID 6 - RAID 5 with double parity
redundancy: 2 disk failures = OK
performance: read fastest, write slowest (has to calculate and 2x write to parity)
min # disks: 4
capacity: (n - 2)/n
good for: same as RAID 5
RAID 10 - striped (RAID 0) + mirroring (RAID 1)
redundancy: 1 disk failure within sub-array = OK
performance: read fastest, write average
min # disks: 4
capacity: 50%
Typically on AWS you would use RAID 0 or 10 to improve IO (e.g. DB server), beyond 0/1 not talked about on http://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/raid-config.html because redundant writes eat IOPS and cost
RAID snapshots done via multi-volume snapshot, EC2 instance does not need to be stopped
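The capacity figures for each RAID level can be sketched as a small calculator. This is an illustration only, assuming equal-sized disks and two-way mirroring for RAID 1 and RAID 10; `usable_capacity_gb` is a hypothetical helper.

```python
MIN_DISKS = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}


def usable_capacity_gb(level, disks, disk_size_gb):
    """Usable capacity of an array of equal-sized disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 0:
        data_disks = disks          # 100%: every disk holds data
    elif level in (1, 10):
        if disks % 2:
            raise ValueError("mirroring in this sketch needs an even disk count")
        data_disks = disks // 2     # 50%: every block is stored twice
    elif level == 5:
        data_disks = disks - 1      # (n - 1)/n: one disk's worth of parity
    else:
        data_disks = disks - 2      # (n - 2)/n: two disks' worth of parity
    return data_disks * disk_size_gb


for level in (0, 1, 5, 6, 10):
    print(level, usable_capacity_gb(level, 4, 100))
```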
auto-scaling - provision new instances when LB health checks fail (i.e. “fleet” management). Scale in (reduce nodes) vs scale out (increase nodes)
launch configuration (defines how EC2 instance is started: AMI, instance type, IAM role, detailed CloudWatch, userdata, network IPs, volumes, security groups, EC2 key) + auto-scaling group. Immutable, need to create a new one and then update Auto-Scaling Group
vs launch template - (preferred method) templates can be versioned (create a “default” with common configuration and another version with more specific details). Immutable. Can easily create a new version of any other version.
allows T2/T3 unlimited bursting, mix instance types with on-demand/spot instances
default scale-in policy balances nodes across AZs: i) select AZ with most nodes, ii) taking into account nodes protected from scale-in (can protect individual nodes or whole group), iii) select oldest launch config, iv) select closest to full-billing hour
scaling policies:
“maintain” - keep a specified minimum number of nodes running
“manual” - you change the minimum, maximum or desired capacity.
min vs desired - will scale to meet desired - scaling policy doesn’t launch instances directly, it affects desired instead
“on schedule” - proactive cycle scaling, define new min, max, desired sizes. Can recur. Can’t schedule two actions at the same time. Use for fixed increases (e.g. scale-out before nightly batch job then scale-in)
“dynamic” - based on a metric. “AWS EC2 auto-scaling” policies of:
target tracking scaling (try to meet some metric, i.e. thermostat. If metric proportional to workload use this, i.e. number of SQS messages to process)
uses instance warmup period during which the warming up instance isn’t counted (so scale activity not re-triggered as long as current scale activity covers it)
gotchas:
implemented as alarms above threshold will scale out, below threshold will scale in. At min capacity, this could always be in alarm state.
magnitude of scale activity is “current value / desired value” (e.g. currently 100% CPU, want 80%, scale 1.25x) but might be too small since 100% is cap. Tracking against unlimited metric better (e.g. queue size, request count)
step - step adjustment size based on size of alarm breach, responds to all alarms even if another scale activity in progress
step adjustment size can be “add 3 or -3 instances”, add/decrease a percentage (rounding towards 1 or -1), or exact capacity which can be scaled based on alarm size (e.g. if metric > 50, scale +3, if > 75 scale +5, if < 30 scale exactly to 2)
simple - old generation, has to wait for cooldown and health check before responding to other alarms
if multiple dynamic policies say to do different things, AWS picks the result that will have the highest resultant capacity
“predictive” uses machine learning to forecast future load for next two days. Needs 24 hours of history. Targets utilisation metric (like dynamic) and scales ahead of time. Might have to do from “AWS Auto-Scaling” console, not EC2 auto-scaling page
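The dynamic policy arithmetic above can be sketched as follows: target tracking scales capacity by "current value / desired value", and step scaling picks an adjustment from the size of the breach. Both functions are hypothetical illustrations using the example thresholds from the notes. Rounding up in target tracking is an assumption of this sketch, not confirmed AWS behaviour.

```python
import math


def target_tracking_capacity(current_capacity, current_metric, target_metric):
    """New desired capacity = capacity * (current value / desired value)."""
    # Assumption: round up so the target is met rather than just missed.
    return math.ceil(current_capacity * current_metric / target_metric)


def step_scaling_capacity(current_capacity, metric):
    """Step adjustments from the notes: >75 adds 5, >50 adds 3, <30 sets exactly 2."""
    if metric > 75:
        return current_capacity + 5
    if metric > 50:
        return current_capacity + 3
    if metric < 30:
        return 2
    return current_capacity


# 4 instances at 100% CPU with an 80% target scales by 1.25x.
print(target_tracking_capacity(4, 100, 80))
print(step_scaling_capacity(4, 60))
```

Because CPU caps at 100%, the ratio can never exceed 100 / target, which is why an uncapped metric such as queue size gives a better scaling signal.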
periods
health check grace period - after instance launched, give time for instance to configure itself before checking for health.
health check process: scaling activity to terminate, then later another scaling activity to replace with new instance
anything but “running” is unhealthy (including system “impaired”) triggering replacement
intervening in health check termination/replacement: i) disable health check so no termination ii) Can use CLI/API to set health check status manually (“custom health check”, e.g. accidental reboot marks instance for replacement, can add back to ASG if it’s not yet terminated)
cooldown period - min period of time between scaling activity giving instance time to absorb load. Applies to simple only
warmup period - for target and step (new-generation of cooldown, i.e. covers case where another bigger alarm triggered within period, so we should add even more capacity during scaling activity)
creating the group
probably want to choose all availability zones (i.e. HA in case an AZ dies). Can merge single-zone ASGs into one by editing one and deleting the others
can check either at ELB level or EC2 level
scaling policy - keep at initial size vs scale between x and y nodes based on certain conditions (e.g CPU utilization hitting 90% for 5 minutes, i.e “dynamic scaling” )
cooldown period (min time between scaling activities)
deleting the group: sets desired capacity and min capacity to 0. Need to detach if you want to keep the instances.
moving instances out of ASG with standby state (standby instance removed from ELB/target group, won’t receive traffic, optionally desired count decremented so no new instance added) so you can debug or more gracefully terminate instances when launch config/template updated (updates don’t replace existing instances).
suspending and resuming scaling activities. Could be suspended if ASG can’t launch instances for 24 hours (administrative suspend). Some side-effects of suspending actions:
“Terminate” (could make group larger than max because ASG allows +10% temporarily during scale-out).
“AZRebalance” stop auto-rebalancing (will still balance during scale-out and will rebalance when enabled by launching then terminate),
“HealthCheck” (disable EC2/ALB checks, customs still run),
“ScheduledActions” (when resume will only fire scheduled actions that haven’t passed yet)
“AlarmNotification” (disable triggering scaling activity from CloudWatch alarm. On resume, breached alarms will trigger)
best practices
use 1-minute metric frequency to be more reactive to sudden changes (EC2 has default of 5-minute, can pay for 1-minute -> 7x cost of 5-minute)
enable “group metrics” collection so that actual capacity is shown in capacity forecasts when creating scaling plan
burstable T instances might be throttled if CPU utilisation target too high. Use unlimited for T3 or other type
small instance types have slow boot time so require longer cooldown/warmup (e.g. m1.small takes ~2m30s to do extras install and yum install stress but t2.large takes 1m20s)
from LogicWorks blog: probably don’t want to downscale too aggressively (they recommend only auto-scaling top 20% of load) since overwhelmed nodes will fail health checks and be replaced.
limitations = can’t span regions (only AZ), can use different instance types with AddInstance API (but ideally all same type), doesn’t handle sudden spikes (as time required to spin up instance) need to return 5xx errors
dedicated instance - EC2 instance on hardware dedicated to you. Hardware-level isolation. May share hardware with non-dedicated instances from the same AWS account. Flavours: default (shared hardware), dedicated (single-tenant hardware), host (dedicated host)
cost more, about $2/hour per region, but can use spot too (same price as regular spot instances apparently)
dedicated host - dedicated hardware rack for you. Costs per server (instances free)
why? compliance, hardware bound licenses (e.g. Windows Server, Oracle)
cons: one host per instance type (e.g. c3 types, or m5 types)
vs dedicated instance: instances are on hardware dedicated to you, instances bill per hour (extra over regular EC2), can use spot instances, can run different instance families
But hosts allow control of instance placement (ensure same hardware all the time), visibility of sockets/cores, and optimise licenses.
vs placement group: placement group is logical grouping, allows more flavours (e.g. spread, partition which aren’t supported by Dedicated Hosts; partition only supports 2 partitions with dedicated instances)
placement groups - logical grouping of instances within single AZ. Enables low-latency, 10Gbps network. Used for Cassandra clusters, http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html
Flavours:
clustered = group within single AZ (same rack) for low latency and high network throughput (5-10 Gbps possible between nodes).
required for Cluster Compute and Cluster GPU EC2 instances
but: limited capacity (should launch all up front), only one instance family type together (see dedicated hosts - same type recommended to increase launch success), single AZ
spread = spread across distinct rack (own network and power source) and AZs, reduces likelihood of simultaneous failure.
but: limit 7 per group per AZ, dedicated host/instance not supported
partition = group into partitions across racks (like spread) for really large cluster with different layers (put different layers in different racks).
HDFS, HBase, Cassandra are topology aware so can make intelligent storing decisions
tries to allocate evenly across partition or you can specify which partition
but: not supported with Dedicated Hosts (might as well use clustered)
restrictions: name unique within your AWS account, only certain types of instances can be launched (C compute optimized, G GPU, R/X RAM optimized, I/D storage-optimized; no T2), can’t merge groups, can’t add existing instance to group (can create AMI from existing instance and launch into group though), reserved instances reserve within AZ not a placement group
recommends: homogeneous instances (same size, instance family, and time because of reservation capacity)
can span peered VPC
to modify tenancy, placement group partition, use CLI and instance must be stopped.
performance - deploy across AZs to reduce contention on network? Udemy course recommended this over placement groups or different EC2 instance
stopping an instance: keeps: private IPs, elastic IP association, ENI (elastic network interface)
loses: data on ephemeral/instance store volumes, public IPs; host will likely change on start. Stays registered with classic ELB/target group (health checks will take it out of service; if terminated it is auto-deregistered)
except on EC2-Classic: private IPs lost too, elastic IP lost (have to re-associate on start)
reboot: doesn’t lose anything
limits = 20 running EC2 instances per region
AWS ELB - Elastic Load Balancer
LB across availability zones (need a subnet in that AZ), only have DNS names, no IPs as they change and can be multiple (but can use elastic for NLB)
application - layer 7, newer, but HTTP(S) only, can do path-based routing, improved cloudwatch/access logs/health checks
vs classic - layer 4, lower level TCP/SSL or app-level HTTP(S),
internal/private load balancer - not internet accessible
health check, polls at interval, configure #requests for (un)healthy, unhealthy nodes removed (if in ASG will be terminated/replaced; could be added back if becomes healthy). ALB/NLB health check states:
initial - ELB is registering and performing health checks
healthy - is healthy
unhealthy - did not respond or failed health check
unused - not registered with target group or target group not associated with listener, or in AZ not enabled for ELB
draining - target deregistering and connections being drained (default 300 seconds)
health check at target group level of ALB/NLB.
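A minimal sketch of how the (un)healthy thresholds above turn a stream of health-check results into a target state, plus the rough detection time they imply. The function names and the default thresholds of 3 are illustrative assumptions, not ELB defaults; only the initial/healthy/unhealthy states are modelled.

```python
def track_health(results, healthy_threshold=3, unhealthy_threshold=3):
    """Return the target state after each check result (True = check passed).

    The state only flips after N consecutive passes or failures.
    Thresholds are illustrative values, not AWS defaults.
    """
    state, passes, failures, history = "initial", 0, 0, []
    for ok in results:
        if ok:
            passes, failures = passes + 1, 0
            if passes >= healthy_threshold:
                state = "healthy"
        else:
            failures, passes = failures + 1, 0
            if failures >= unhealthy_threshold:
                state = "unhealthy"
        history.append(state)
    return history


def detection_seconds(interval_s, unhealthy_threshold):
    """Rough time to notice a dead target: interval x consecutive failures."""
    return interval_s * unhealthy_threshold
```

For example, checking every 15 seconds with 3 consecutive failures required means roughly 45 seconds before a dead target is marked unhealthy, which is why the health check alone does not give instant failover.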
when created, everything out of service, needs to poll
on delete, instances preserved
for high availability need two subnets (i.e. two public or two private, can’t create ALB one subnet)
enable cross-zone load-balancing (always enabled for ALB, NLB disabled by default) allows ELB node within AZ to use all zones (when disabled round-robins AZs and if uneven number of nodes in AZ, traffic won’t be even)
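The uneven-traffic point is easier to see with numbers. This sketch assumes traffic is split evenly between AZs when cross-zone is disabled (the round-robin described above) and evenly between all targets when it is enabled; the function name and AZ labels are made up for illustration.

```python
def per_target_share(targets_per_az, cross_zone):
    """Fraction of total traffic that each target in each AZ receives."""
    azs = {az: n for az, n in targets_per_az.items() if n > 0}
    if cross_zone:
        # Every target is reachable from every LB node: even split overall.
        total = sum(azs.values())
        return {az: 1 / total for az in azs}
    # Each AZ gets an equal slice, shared only by that AZ's own targets.
    return {az: 1 / len(azs) / n for az, n in azs.items()}
```

With 2 targets in one AZ and 8 in another, disabling cross-zone gives each of the 2 targets 25% of all traffic while each of the 8 gets 6.25%; enabling it gives every target 10%.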
failure reasons: EC2 stopped, Apache not running, security groups not allowing
cost = #hours (or partial hours) up plus LCUs (load balancer capacity units), charged on whichever dimension below is highest (NLB measures them separately for TCP, UDP, and TLS traffic):
new connections (connections or flows for network) - typically lower than request
active connections (connections or flows for network) - peak sampled minutely
processed bytes in GB/hr
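A sketch of the "highest dimension wins" LCU calculation. The per-LCU capacities in EXAMPLE_CAPACITY are assumed example figures (they differ by load balancer type and protocol, so check the pricing page); the function name is hypothetical.

```python
# Assumed example capacities covered by one LCU; real values vary by LB type.
EXAMPLE_CAPACITY = {
    "new_connections": 25,       # new connections per second
    "active_connections": 3000,  # active connections, sampled per minute
    "processed_gb": 1,           # GB processed per hour
}


def lcus_per_hour(new_conn_per_s, active_conn, gb_per_hour, capacity):
    """Return (billing dimension, LCUs): only the highest dimension is charged."""
    usage = {
        "new_connections": new_conn_per_s / capacity["new_connections"],
        "active_connections": active_conn / capacity["active_connections"],
        "processed_gb": gb_per_hour / capacity["processed_gb"],
    }
    dimension = max(usage, key=usage.get)
    return dimension, usage[dimension]
```

With these assumed capacities, 50 new connections/sec, 3000 active connections and 0.5 GB/hr works out to 2 LCUs, driven by new connections.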
routing: application/network register targets in target groups (can register multiple EC2 instances on different ports, i.e. containers); classic registers instances (no container support)
IP target allows you to mix-and-match targets within a target group (containers and EC2 instances in case of still migrating workloads)
access logs - e.g. request was received, the client’s IP address, latencies, request paths, and server responses
SSL/TLS support:
ALB and NLB allow picking or uploading (post key details) cert to both ACM and IAM (IAM handling for certs only advised in regions where ACM not supported). Classic same except can’t upload to ACM.
CloudHSM for SSL offloading clunky. Use TCP NLB to terminate requests at web server. Web server (with Apache/Nginx with config telling it to use CloudHSM) has a fake cert which is derived from the real cert in CloudHSM.
Security policy - which ciphers and SSL/TLS versions to support.
gotchas:
ALB has no static IP, which can make firewall whitelisting by IP impossible and causes problems for clients that don’t refresh DNS
scaling out takes 1-7 minutes + 60 sec DNS updates means long ramp up times. Sticky sessions make this worse by keeping traffic on the original nodes
if can’t take requests, return 503 response. Workarounds: contact AWS support to “pre-warm” ELB, or restrict growth to 50% over 5 mins (designed for this). re:Invent 2017 says up to 1.5M requests/sec fine but time period unknown
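To get a feel for the ramp-up limit, this sketch reads "50% over 5 mins" as: capacity can grow by at most 50% in each 5-minute window. That reading, and the function name, are assumptions for illustration.

```python
def minutes_to_ramp(start_rps, target_rps, growth=0.5, window_min=5):
    """Minutes needed to grow from start_rps to target_rps at the allowed rate."""
    minutes, capacity = 0, start_rps
    while capacity < target_rps:
        capacity *= 1 + growth   # at most +50% per window
        minutes += window_min
    return minutes
```

Under that assumption a 10x traffic increase (1,000 to 10,000 requests/sec) needs about 30 minutes of gradual ramp, which is why sudden spikes are the case for asking AWS to pre-warm.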
ELB idle timeout is 60 sec, so back-end apps need a keep-alive timeout >60 sec; otherwise the back-end drops the connection first and ELB will think it’s unhealthy
consumes IPs from subnet so need to ensure enough free IPs
NLB has extreme performance vs ALB
can’t mix target types (e.g. private IPs, Lambda, EC2 instances) within a target group. Private IPs allow different apps to be served on same port and host.
no X509 client authentication (use API Gateway in front of ELB). However ALB does support OpenID Connect and Cognito authentication workflows.
Comparison of Application vs Network vs Classic load balancers:
Target
Application: targets in target groups - IPs within VPC or VPC-peer on-prem via Direct Connect (RFC 1918, RFC 6598 100.64.0.0/10) - Lambda (for AWS migration)
Network: targets in target groups - IPs within VPC or on-prem via Direct Connect (RFC 1918, RFC 6598 100.64.0.0/10)
Classic: instances - not via IPs (so no on-prem) - no concept of “target group”
Load balancer address
Application: public: DNS over public subnet; private: internal DNS
Network: public: can have one Elastic IP per subnet, providing static IPs, otherwise IP in public subnet per AZ; private: DNS to private IPs
Classic: public/internet-facing needs DNS to public IP; private: internal DNS to private IP addresses. ELB has IPv4/6 and dualstack DNS name; IPv6 addresses not supported in EC2-VPC (use Application) but IPv6 addresses supported in EC2-Classic
Proxy/Client IP
Application: X-Forwarded-For. Terminates connection and opens new connection to target
Network: preserves client IP if target is instance (doesn’t terminate connection). If IP is target, then source IP is load balancer IP (as IP could be outside VPC), so have to enable Proxy Protocol v2 which uses a HAProxy TCP header
Classic: X-Forwarded-For for HTTP, Proxy Protocol v1 for TCP
Security Groups
Application: Yes
Network: No, which means: - can’t allow traffic to target by allowing SG of NLB - instead, the backend app has to allow by client IP if NLB target is instance ID (which preserves source IP) or ELB IP (i.e. subnet for targets as IPs) - backend app needs to allow healthcheck from ELB
Classic: Yes, but EC2-Classic security groups are different (you can’t pick the ELB’s security group, it has its own)
AZs / subnets
Application: at least 2 AZs, 1 subnet per AZ per LB. Can change after creation
Network: single AZ allowed, 1 subnet per AZ per LB
Classic: one or more AZs, 1 subnet per AZ per LB, need public subnets. Can change after creation
Static IP
Application: No
Network: automatically allocates 1 (or Elastic IP) per AZ. For firewall IP whitelisting
Classic: No
Routing Features
Application: simple rule-based routing (path/URL and host header)
Network/Classic: nothing. When you create listener you specify backend protocol and port
Port forwarding
Application: different ports per instance (useful for containers)
Network: different ports per instance
Classic: global port per instance
Sticky sessions
Application: Yes via cookie (ALB-generated cookie only) or web sockets
Network: No
Classic: Yes via cookie (generated by ELB or backend)
User authentication
Application: OpenID Connect compliant IdP, social (Amazon, Facebook, Google via Cognito), corporate (SAML, LDAP, Microsoft AD via Cognito)
Network: No
Classic: No
WAF
Application: Yes
Network: No
Classic: No
SSL termination
Application: Yes. Can re-encrypt to target but no client auth (AWS says already secure with VPC unlike shared Classic network)
Network: Yes (TLS). No re-encrypt or client auth, do TCP pass-through and terminate on backend target
Classic: Yes. Can re-encrypt and do client auth against backend if backend target is HTTPS
SNI support
Application: Yes. You provide ELB list of certs and default only used if client doesn’t support SNI.
Network: Yes. You provide ELB list of certs and default only used if client doesn’t support SNI.
Classic: No. Either present one cert with Subject Alternative Name set, or handle TCP only and terminate at EC2 instance
Health check
Application: active only at HTTP(S) level looking for HTTP 200-499 response
Network: passive (can’t be disabled/configured, observe how target responds, not for UDP) plus active at HTTP(S) looking for 200-399 response or TCP level
Classic: active only at TCP, HTTP(S), SSL. For TCP, must connect successfully; for HTTP(S), must return 200 OK; for SSL, must handshake successfully
EC2-Classic/VPC-Classic Support
Application: not via instance ID. Peer VPC with ClassicLink (see EC2-Classic instances in same VPC region) and use private IP
Network: Yes
Classic: Yes
Cost
Application: partial hour and LCU
Network: partial hour and LCU
Classic: partial hour and per GB processed
Gotchas
Application: must be at least 2 AZs
Classic: - route to only one port per instance (if need more, need multiple Classic ELBs) - no IP targets (no within network only) - no ECS support
Limits
Global: 20 LB per region
Application: 3000 target groups per region (shared with Network), 5 security groups per LB, 50 listeners per LB, 1 subnet per AZ per LB, 1000 targets per LB, 1000 targets per target group, 1 LB per target group
Network: 3000 target groups per region (shared with Application), 50 listeners per LB, 1 subnet per AZ per LB, 200 targets per LB per AZ, 1 LB per target group
Classic: 100 listeners per LB, 5 security groups per LB, 1 subnet per AZ per LB
Containers
AWS supports Open Container Initiative containers (e.g. Docker, Windows containers). See here.
ECS is Docker only in FAQ
AWS Elastic Container Service (aka ECS), ec2 + docker (ship entire appliance onto Linux host, appliance can share parts of the OS unlike VM, isolates processes, not resources unlike VM)
managed docker (does deployments for you, but running it is EC2), i.e. container management, run, stop docker containers on EC2 instances, consistent (since all dependencies there).
still have access to OS of EC2 instance, you have to manage EC2 patching, scaling, monitoring. Scaling hard since ECS and Auto-Scaling group aren’t aware of each other
vs Fargate: it runs the tasks for you and you pay for requested vCPU and memory. Less control, no EC2 access (no EFS/EBS integration, ephemeral storage, networking is only elastic network interface per task)
docker = more efficient (vs having a VM for each application, can have multiple docker containers on one VM, don’t need virtual OS), scales, standardise environment (docker engine isolates OS and ship all dependencies), but tightly coupled with Linux (so no Windows)
docker image = build output of docker file. Contains everything need to run, immutable, may depend on other images (like CloudFormation template). Images stored in registry (DockerHub or AWS EC2 Container Registry, ECR)
docker container = actual running image, can move around, stop/start,
layers/union file systems = docker unions all the layers together to create one file system, makes updates fast/small (just changes the layer)
Dockerfile = text file of instructions/commands from which docker images are built as layers (each command creates a new layer in the image)
Docker Daemon/Engine = runs on Linux to run containers
Docker Client = used to control containers/daemon
Docker Registries/Hub = public or private repositories of images
task definition = JSON template specifying one or more containers (which includes docker image(s) and docker repository, memory, CPU, disk, shared vol requirements, network port mapping to host, whether task runs when container finishes/fails, command to run when container starts, environment variables, IAM roles to use, plus how containers are linked to each other) - defines how to start containers
like auto-scaling groups but for containers but defined by specified/desired number of instances of task definitions
task - instantiation of task definition.
Note Docker defines slightly differently: task = atomic unit of scheduling (manager schedules tasks onto available nodes) where “container” is the task instance (Docker image, runtime, instructions); a service is composed of multiple tasks, and each task carries a container.
service - minimum number of tasks that run simultaneously, e.g. web server service. Auto-scaling group at this level
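A sketch of what a task definition looks like, written as a Python dict and serialised to the JSON that ECS expects. The field names follow the ECS task definition format; the account ID, role, image, log group and environment values are all made-up placeholders, and check_task_definition is a hypothetical helper.

```python
import json

task_definition = {
    "family": "web-app",  # placeholder name
    "taskRoleArn": "arn:aws:iam::123456789012:role/example-task-role",
    "containerDefinitions": [
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/hello-repository:latest",
            "cpu": 256,
            "memory": 512,
            "essential": True,  # task stops if this container stops
            # hostPort 0 asks for a dynamically assigned host port (bridge mode)
            "portMappings": [{"containerPort": 80, "hostPort": 0, "protocol": "tcp"}],
            "environment": [{"name": "STAGE", "value": "test"}],
            "command": ["nginx", "-g", "daemon off;"],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/web-app",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "web",
                },
            },
        }
    ],
}


def check_task_definition(td, max_containers=10):
    """Basic sanity checks, then return the JSON document."""
    containers = td["containerDefinitions"]
    if not 1 <= len(containers) <= max_containers:
        raise ValueError("task definition needs 1 to %d containers" % max_containers)
    names = [c["name"] for c in containers]
    if len(set(names)) != len(names):
        raise ValueError("container names must be unique within a task")
    return json.dumps(td, indent=2)
```

A task is then one running copy of this definition, and a service keeps the desired number of copies running.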
ECS cluster - logical grouping of container instances which tasks can be placed on (not related to EC2 placement groups which are AZ-specific)
can have different EC2 instance types, region-specific (but can be multi-AZ), container instance can only be part of one cluster at a time, IAM policies can restrict access to the cluster (e.g. restrict who can deploy to PROD cluster)
ECS scheduling
service scheduler - ensure specified number of tasks are constantly running and reschedules tasks if it fails (e.g. EC2 instance death), keeps tasks registered with ELB
custom scheduler - define own schedules, integrate with 3rd party schedulers e.g. Blox
ECS container agent - allows container instances to connect to cluster (provided by Amazon ECS-optimized AMI but can install on supported EC2 instances, only EC2 instances, no Windows support)
ECS service discovery - service mapping into Route 53 or AWS Cloud Map API for discovery. Limitations: FarGate only, private IPs only (if behind ELB, private IPs still used; private Route 53 hosted zone)
AWS CloudMap: purpose-built service registry using DNS CNAME lookup to get IPs or API which should avoid DNS cache issues. Health checks. Integration with ECS and Kubernetes.
Security - IAM roles control (who can access ECS, ECS tasks use IAM roles to access services/resources). Security groups are at the host/EC2 level, not task/container
Logging - in task definition specify log driver (e.g. awslogs) and destination
Scaling - cluster-level using EC2 ASG (note Fargate only at service level), can scale based on reserved CPU and memory usage. Service-level using Application Auto-Scaling (target/step policies), can scale based on workload (SQS queue length) or CPU
Storage - in general, writing to container’s file system is ephemeral (will be lost when container stops) as it writes to the “writable container layer”
Fargate is all ephemeral (“Fargate Task Storage”, can’t get persistent storage). Example on documentation is WTF database scratch space, Fargate pointless???
does support bind volumes (see below), could use EFS instead
EC2 launch has persistent storage as follows (these are Docker concepts, see here):
docker volume - Docker managed volume (stored on host and exclusively accessed by Docker). Usable by multiple containers within a task; and persistent storage on EBS (EFS in beta Feb 2020) that can be shared by multiple containers within a host. Or 3rd party driver storage (i.e. EBS/EFS)
bind volume - stored on host directly (not Docker managed so very performant but dangerous (can modify host OS, OS can modify it)). Persistent* container storage (remains if container stopped *but will die if host dies). Can be shared across containers within instance, or share defined data volumes at different locations on different containers within instance. Can be used with Fargate.
Limits: hard-limits: 1 load balancer per service, 1000 tasks per service (desired count), 10 containers per task definition, 10 tasks per instance (host)
soft limits: 1000 clusters per region, 1000 instances per cluster, 500 services per cluster
versus Elastic Beanstalk - Beanstalk uses ECS under the hood and can launch RDS too but is simpler (i.e. Beanstalk you upload your application image and AWS does the rest, ECS gives you direct access to how things are deployed), CloudFormation template is naff name/value pairs
ECS (on EC2) should be more efficient (can binpack containers onto same machine), load balancing is around services (Beanstalk is EC2 auto-scaling)
AWS ECR Elastic Container Registry - scalable/durable Docker registry for docker images, uses IAM, developers can push/pull/manage images
Tagging. Docker images are tagged with a “repositoryUri = aws_account_id.dkr.ecr.region.amazonaws.com/hello-repository” tag, add image tag (if missing then “latest” - would require “Force New Deployment” to redeploy), then push to ECR.
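A small sketch of how the repository URI and image tag fit together. The helper name is hypothetical and the account ID is a placeholder; the default tag mirrors the "latest" behaviour noted above.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build the full image reference to tag and push to ECR."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"
```

Using an explicit version tag instead of "latest" avoids needing "Force New Deployment" to pick up a new image.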
AWS Elastic Beanstalk = helps deploy apps and provision what’s required, “AWS for beginners”, automatically deploys, capacity provisioning, ELB, auto-scaling to app health monitoring. OS patches done for you (opt-in)
you just upload docker, Java, .NET, PHP, Node.js, Python, Ruby, Go, e.g. Java on Tomcat uploading a .war (or without Tomcat) on Linux + Apache HTTP Server 2 (or nginx)
less control, no CloudFormation (but is used behind the scenes to create the environments), during creation don’t have direct EC2/ELB access (but do have access afterward). Instead have option settings (but precedence order feels naff, changes done to environment directly (console/API) take precedence over those in the file and saved config which makes infrastructure as code more difficult - what’s the truth?)
handles for you: auto-scaling, ELBs, cloudwatch logs/alarms, DB/RDS integration (or can auto-provision RDS/DynamoDB but probably don’t want to because it will be deleted when application is deleted), environment variables (to deploy TEST v PROD)
has concept of environments (dev, test, prod), defines application versions
uses instance storage, so need to provision persistent storage (i.e. EBS, EFS, S3, RDS). For custom logs, need to configure CloudWatch agent
Andrew’s hands-on notes:
Beanstalk applications have their own URLs (under us-east-1.elasticbeanstalk.com). Blue-Green swaps Route 53 URL of two environments
CodeStar pipeline creates 3 CloudFormation stacks: i) CodeCommit, CodePipeline, CodeBuild, IAM entities, S3 buckets, Elastic Beanstalk and two child stacks: ii) “infrastructure” as the elastic beanstalk application (can be deleted), iii) which in turn creates a stack for each environment
CodeStar pipeline updates existing application version (not creating another version), so lose track of what versions are deployed to what environments in multi-environment apps
Can’t rely on ELB for zero-downtime failover because health check is periodic (e.g. 3 consecutive failures checking every 15 seconds)
Blue-Green requires: i) deploy alt environment ii) blue-green to alt iii) do code change which deploys to orig environment iv) then blue-green to orig
Elastic Beanstalk Deployment Types: All at once (in-place) vs Rolling vs Rolling with Additional Batch vs Immutable vs Blue-Green
What
All at once: all nodes simultaneously
Rolling: update nodes in batches
Rolling with Additional Batch: update nodes in batches with an extra batch to maintain full capacity
Immutable: deploys to new instances under new auto-scaling group (but same ELB) and all new instances must be healthy (or at least warning-level) to swap. Both old/new will serve content. Avoids partially complete rolling deploys.
Blue-Green: new instances and new ELB, then swap DNS/URL
Supports single-instance environment
All at once: Yes
Rolling: No (becomes all at once)
Rolling with Additional Batch: No
Immutable: Yes
Blue-Green: Yes
Downtime
All at once: Yes
Rolling: No
Rolling with Additional Batch: No
Immutable: No
Blue-Green: No
Deployment time
All at once: Minimum
Rolling: Lower
Rolling with Additional Batch: Medium
Immutable: Higher
Blue-Green: Higher
DNS change?
All at once: No change
Rolling: No change
Rolling with Additional Batch: No change
Immutable: No change
Blue-Green: Change
Impact of failed deployment
All at once: downtime
Rolling: single batch out of service, successful batches run new version
Rolling with Additional Batch: minimal if first-batch fails, else similar to rolling
Immutable: minimal
Blue-Green: minimal
Rollback process
All at once: manual redeploy
Rolling: manual redeploy
Rolling with Additional Batch: manual redeploy
Immutable: terminate new instances
Blue-Green: swap URL in BeanStalk environment config
Code deployed to
All at once: existing instances
Rolling: existing instances
Rolling with Additional Batch: existing instances
Immutable: new instances (termination and replacement for VPC or launch config changes). Some changes can be done to existing (e.g. health checks)
Blue-Green: new instances
CodeDeploy
All at once: “AllAtOnce” for EC2/on-prem only (no Lambda/ECS)
Rolling: “OneAtATime”, “HalfAtATime”
Andrew implied: for EC2 can terminate old instances. Lambda? ECS?
Blue-Green: on-prem: not supported. EC2: at ELB level with originals deregistered or terminated; new instances created from ASG or specify manually. Lambda: always blue-green; new version of same function; choose how traffic is shifted (canary, linear, all-at-once). ECS: see below
ECS
All at once: N/A
Rolling: supported specifying minimum and 100% maximum healthy
Rolling with Additional Batch: rolling update with 100% minimum healthy and maximum <200%
Immutable: rolling update with 100% minimum healthy and 200% maximum
Blue-Green: using CodeDeploy. Requires ALB/NLB with production listener (with original target group) and optional test listener (for new target group and validation). Traffic routed all-at-once using blue-green.
Gotchas
Can’t change resource (ELB) and code simultaneously. Either do in two updates or use blue-green
Blue-Green: - Needs data layer (RDS) independent of environment - DNS change won’t work if clients don’t respect TTL. Perhaps use 3rd party tools to add/remove instances from ELB (re-using ELB ensures it’s pre-warmed) - Really expensive to have two PROD environments - Beanstalk doesn’t actually do this for you, you have to clone a Beanstalk application and then swap URLs - Difficult for stateful apps (store session separately) - Use Route 53 to do canary approach.
AWS EKS Elastic Kubernetes Service - managed Kubernetes. AWS manages master nodes (control plane: manages configuration, cluster state, schedules which node each pod (one or more containers with shared network/storage, co-located/scheduled) runs on, deployment of pods)
pay for master node (is multi-AZ for HA), otherwise EC2 under the hood (like ECS)
control plane is single-tenant (one cluster but each cluster could manage multiple applications)
appears to have some AWS integration: IAM entities map to Kubernetes authentication system
deploying apps: after cluster created (nodes provisioned), you “apply” config (spec’d in YAML or JSON) to the cluster via Kubernetes CLI (kubectl)
auto-scaling flavours:
“cluster autoscaler” - adjust EC2 instances in cluster (for when pods fail to launch because not enough resources or under-utilised nodes). Uses EC2 ASG. EKS node groups are AZ-specific (so need one per AZ) but can use --balance-similar-node-groups to keep AZs same size.
“horizontal pod autoscaler” (HPA) - scale-in/out deployment/replication controller/replica set based on metric (e.g. CPU utilisation). Kubernetes-specific implementation (Kubernetes metrics server)
“vertical pod autoscaler” - adjusts CPU/memory reservations for pods. Need metric server too.
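For the horizontal pod autoscaler, the core of the scaling decision is a ratio of current to target metric. This is a simplified sketch of that formula only; the real controller also applies a tolerance band and stabilisation windows, which are left out, and the min/max defaults are placeholder values.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric,
                         min_replicas=1, max_replicas=10):
    """desired = ceil(current_replicas * current_metric / target_metric), clamped."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 pods averaging 90% CPU against a 60% target scale out to 6 pods; the same 4 pods at 30% CPU scale in to 2.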
LightSail
BeanStalk
ECS (EC2)
ECS (FarGate)
EKS
Summary
For students, small business that need virtual compute. Template for Wordpress, LAMP
Kubernetes-compliant, non-AWS, AWS manages control plane for you.
More AWS-independent. Own Route 53, ALB, CloudWatch equivalents
Pay
EC2
EC2
Requested vCPU and memory
Master node (control plane) hours, EC2
Control/Flexibility
- Very Low Dumbed down console, push-button add ELB (auto-scaling). - Limited AMIs and RDS options. Simplified pricing.
Low Less control options than provisioning yourself (console has own screens and own API). Don’t control: security groups, ELB health checks, RDS (replicas).
Manual config changes applied to console higher precedence than .ebextensions in app bundle which makes console changes impossible to manage.
Support task placement: binpack, random, spread (so in theory, better resource utilisation)
Can select EC2 GPU optimised tasks (not supported in Fargate)
Low
High, lots of 3rd party add ons
Support task placement: binpack, random, spread
Persistent storage
LightSail: EBS
BeanStalk: EBS, EFS
ECS (EC2): EBS, EFS
ECS (FarGate): EFS
EKS: EBS, EFS
Access to underlying EC2 instances in console?
LightSail: No. Can convert LightSail to EC2
BeanStalk: Yes
ECS (EC2): Yes
ECS (FarGate): No
EKS: Yes
Who does EC2 patching?
LightSail: you via SSH
BeanStalk: either. Can opt-in to AWS patching platform for you. Does blue-green so no downtime.
ECS (EC2): you
ECS (FarGate): N/A
EKS: you
Scaling?
LightSail: AWS
BeanStalk: AWS via auto-scaling group
ECS (EC2): AWS at service-level via auto-scaling groups you tweak
ECS (FarGate): AWS at service-level (multiple instances of tasks)
EKS: AWS at pod level (co-located containers) via Horizontal Pod Auto-Scaler (HPA)
Load Balancer
AWS via ELB (classic, network, application)
ALB, NLB, Classic
ALB recommended: service can service multiple ALBs and ports, support dynamic port assignment (i.e. have >1 task on one node), path based routing so one ALB port can redirect to different services
ALB, NLB
Provision RDS?
LightSail: Yes
BeanStalk: Yes but probably bad for PROD since it ties RDS to app lifecycle.
ECS (EC2): No, not about provisioning
ECS (FarGate): No, not about provisioning
EKS: No, not about provisioning
Network
ENI per container (could run into ENI limits per EC2 instance), allows security group per container
ENI per task (which means IP)
ENI per pod (more efficient, but looser security control)
Security
IAM at container/task level
IAM at task level
IAM at worker/ec2 level
Supports App Mesh?
LightSail: not during create but could add agent later.
BeanStalk: not during create but could use custom AMI
ECS (EC2): Yes
ECS (FarGate): Yes
EKS: Yes
AWS App Mesh - application-level networking (let app talk to communication bus or universal data plane). Consistency (protocols, auth, retries), service discovery, easier to monitor, troubleshoot, auto reroute in failure
Envoy proxy on each app
Doesn’t seem to integrate that much with IAM (still need IAM to interact with service but currently doesn’t support resource-level control).
Vs Lambda which is next iteration of cloud compute (monolith -> containers -> serverless), service discovery easier since abstracted to service level already (not individual instances)
AWS Lambda = run code without provisioning servers (i.e. no EC2 instances, AWS handles OS, patching, scaling, etc), supports Java, Python, Node.js, C#, Ruby, Go, containers, or custom runtime to run other languages
pay for compute time used, priced by memory allocated (so if not running, no cost), first 1M requests free, then 20c/1M requests; compute billed in GB-seconds
application: collection of Lambda functions and other resources to do a task, makes it portable, deployable via CodePipeline/CodeDeploy, CloudFormation (with rollback)
could react to event changes (e.g. S3 bucket, DynamoDB), e.g. if image uploaded to S3, convert to thumbnail, Lambda@Edge (CloudFront to localise site), could react to HTTP requests via Amazon API Gateway or gateway calls
Lambda@Edge - compute at regional edge locations. Triggered on CloudFront receiving request, before and after CloudFront talks to origin, before CloudFront responds to requester
why? rewrite URLs for A/B testing based on a cookie, inspect user-agent/referrer to 302 to different images (mobile), call external resource for additional authorisation checks, normalise query string to increase cache hits, custom error page on error response from origin
vs CloudFront functions which are lighter/faster/fewer features but execute closer to user (at points of presence)
Layers - include common libraries (optionally including custom runtime) as a “layer”. Each function can sit on top of up to 5 layers. Partners build layers to help integrate with their apps. Makes it hard to test locally (tooling needs to catch up)
gotchas: stateless so need other persistent layer (S3, DynamoDB, Elasticache); functions need to be idempotent (in case the up to 3 retries kicks in)
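A minimal sketch of an idempotent handler: a retried event with the same ID returns the stored result instead of repeating the work. The in-memory dict stands in for a durable store (e.g. DynamoDB), and the event fields "id" and "object_key" are hypothetical names chosen for the example.

```python
processed = {}      # stand-in for a durable store such as a DynamoDB table
side_effects = []   # records each time the real work actually runs


def handler(event, context=None):
    """Process each event ID at most once; retries get the saved result."""
    key = event["id"]
    if key in processed:
        return processed[key]          # retry: no repeated side effect
    side_effects.append(key)           # the "real work" happens here
    result = {"thumbnail": "thumbs/" + event["object_key"]}
    processed[key] = result
    return result
```

In a real function the lookup and write would need to be a single conditional write so that two concurrent retries cannot both run the work.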
patterns: fan-out architecture (one Lambda function invokes other functions to do tasks concurrently)
never pay for idle, use x-ray to identify areas where application is waiting (calling external API), try to multi-thread them (beware pricing rounded up to nearest 100ms) or use step functions to coordinate (free waiting especially for week-long waits)
limits: 128MB to 3GB RAM, timeout 1 to 900 seconds (15 mins); 1000 concurrent executions; request body payload 6MB (sync), 256K (async). Max deployment size 250 MB unzipped. /tmp space is 512MB.
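A rough cost sketch for the pricing notes above: requests beyond the free 1M at 20c per million, plus compute in GB-seconds with duration rounded up to the nearest 100ms. The GB-second price is an assumed figure passed as a parameter (check the pricing page), and any free compute allowance is ignored.

```python
import math


def lambda_monthly_cost(requests, avg_ms, memory_mb,
                        price_per_million=0.20,
                        price_per_gb_s=0.0000166667,  # assumed price
                        free_requests=1_000_000,
                        billing_ms=100):
    """Estimate monthly cost in dollars (ignores any free compute tier)."""
    billed_ms = math.ceil(avg_ms / billing_ms) * billing_ms   # round up
    gb_seconds = requests * (billed_ms / 1000) * (memory_mb / 1024)
    request_cost = max(requests - free_requests, 0) / 1_000_000 * price_per_million
    return round(request_cost + gb_seconds * price_per_gb_s, 2)
```

Under these assumptions, 3M requests at 120ms (billed as 200ms) with 512MB comes to about $5.40, most of it compute rather than request charges.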
AWS SAM (Serverless Application Model) - open-source framework for building serverless apps
uses YAML and extension of CloudFormation (creates CloudFormation templates behind the scenes, so can use everything CloudFormation can include)
has its own CLI (start, debugging, deploying)
lots of examples in Serverless Application Repository - library of public/private serverless apps you can deploy (free). Can build library within organisation to prevent duplication
vs Marketplace is paid
enables local testing/debugging!
vs “Serverless Framework” which supports other cloud providers
AWS EventBridge - serverless event bus connecting AWS and 3rd party SaaS apps with routing rules. Event-driven applications.
events are structured according to schemas held in a schema registry, which allows filtering/routing rules to be defined
pre-integrated 3rd party services exist as sources (lots) and destinations (e.g. feed DataDog alerts to run auto-remediation pipeline, analytics; feed PagerDuty events). Seems 3rd party into AWS only (not out from AWS to 3rd party)
use cases: consume ZenDesk support tickets, PagerDuty alerts, DataDog alerts to trigger workflow
vs SNS: EventBridge for reacting to events from AWS/SaaS application, pre-baked integrations so no code. Standardised event structure to ease forwarding rules.
SNS for low-latency/high-throughput microservices integration (EventBridge currently 50 requests/second and 400 puts/second)
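A sketch of how a routing rule filters events. EventBridge patterns list the accepted values for each field and nest to match inside the event; this toy matcher covers only that exact-match subset (no prefix, numeric or array-valued event matching), and the function name is made up.

```python
def matches(pattern, event):
    """True if every field in the pattern is present with an accepted value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested part of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:   # expected is a list of accepted values
            return False
    return True


event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "terminated"},
}
rule = {"source": ["aws.ec2"], "detail": {"state": ["stopped", "terminated"]}}
```

Here matches(rule, event) is true; fields the rule does not mention are ignored, which is what lets one bus feed many narrowly scoped rules.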
VM import/export - move VMs to/from AWS. Convert VM image to AMI. Export to VMWare, Hyper-V, Citrix.
Being replaced by SMS
how? copy image export to S3, use CLI to start import task (which converts to AMI), probably need to install Amazon agents/drivers. Can also import running instances but SMS better for this.
Server Migration Service (SMS) - replicates VMWare vSphere, Windows Hyper-V, and Azure VMs to an AMI
how? Server Migration Connector (FreeBSD VM you install on-prem which creates snapshots), needs SSH/RDP access
can integrate with Migration Hub, launch applications from console or create CloudFormation template
vs VM Import/Export: SMS provides automated, live, incremental replication (up to 90 days) and AWS console
limits: 50 concurrent migrations, VMWare only, 90 days to replicate per VM
AWS Lightsail - simplistic EC2, EBS, DNS, 5 static IPs $/hour ($0.007/hour at cheapest = $5/month), doco and AWS console really dumbed down
AWS Batch - management tool for creating/managing/provisioning/executing/scheduling batch-oriented tasks on EC2
good for: schedule recurring tasks that don’t require heavy coordination logic, e.g. end of month reporting, rotating logs on appliance
supports popular batch computing workflow engines and languages (e.g., Pegasus WMS, Luigi, and AWS Step Functions), don’t worry about servers, scaling, handles dependencies, leverage spot pricing on EC2 marketplace
use cases = media-transcoding, financial analysis, end of trade day analysis, bulk processing of claims
creation how to: i) specify compute environment (managed vs unmanaged, spot/on-demand instances, vCPUs), ii) create jobs in a queue with priority, assign to compute environment, iii) create job definition (script/JSON, environment variables, mount points, IAM role, AMI, container image, etc) to run the actual job, iv) schedule job
managed environments will scale automatically vs unmanaged (use dedicated hosts, EFS, larger EBS volume, different OS - you provision and install Amazon ECS agent + Docker)
pricing = no extra charge on top of EC2 or lambda
Compute Asides
VMWare stuff
“Compute Gateway” firewall rules prevent uplink by default (separate to EC2 security groups). So for domain join, would need to allow.
Security and Identity
IAM identity access management = control users, roles, groups, password policy, multi-factor, federation (Active Directory, Facebook, LinkedIn), temp access. Not region-specific
users = end users (people)
groups = groups of users under one set of permissions
roles = assigned to AWS resources, e.g. ec2 instance has role to access s3 (no username/password for this). Roles for services, cross-account access, identity provider access. Credentials periodically rotated transparently
trust relationship - defines who can assume the role, e.g. a person or a service (ec2.amazonaws.com)
identity broker - takes requests from identities and looks up against identity store of federated identity provider
IAM policy: PARC-model (principal, action, resource, condition)
principal: entity that is allowed/denied access. Can’t be specified in identity-policy (principal is already implied). Could be IAM user, root account, account, federated user (e.g. all federated Google users), role, service (e.g. EC2), everyone (“AWS”: “*”)
action: or NotAction (everything but). Stars (e.g. “iam:*AccessKey*”) are dangerous: as new capabilities are released, you’ll get these permissions too
resources: or NotResource (everything but), object being requested.
condition: reading down is AND, across is OR. To make more complex ORs, create multiple condition clauses
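A sketch of the PARC elements in one resource-based policy, written as a Python dict and dumped to JSON. The account ID, role name, bucket name and IP ranges are placeholders; the condition shows the AND/OR rule, with the two operators ANDed together and the two IP ranges ORed.

```python
import json

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowReadFromOffice",
            "Effect": "Allow",
            # Principal: only valid in resource-based policies
            "Principal": {"AWS": "arn:aws:iam::123456789012:role/example-reader"},
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
            "Condition": {
                # Operators (reading down) are ANDed...
                "IpAddress": {
                    # ...values for one key (reading across) are ORed.
                    "aws:SourceIp": ["203.0.113.0/24", "198.51.100.0/24"]
                },
                "Bool": {"aws:SecureTransport": "true"},
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
```

The same statement attached as an identity policy would drop the Principal element, since the identity it is attached to is the principal.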
power user policy = allows full access to AWS except IAM
managed policy (standalone, can be attached to multiple identities) vs inline (attached directly onto single identity, not recommended)
resource-based policies = attach to resources (e.g. S3 bucket). Cumulative in nature (i.e. effective permissions is union of resource and identity permissions)
gotchas: within account is union with other policies; but cross-account is intersection (although might be easier to assume a role when managing cross-account)
note S3 has two resource-based mechanisms: bucket policy and ACL
permission boundary - defines maximum permissions (but doesn’t grant by itself) for IAM user or role. Doesn’t restrict resource-based permissions
relationship with identity: intersect
vs Service Control Policies (SCP) - only those at intersection of identity-policy, SCP, and permission boundary allowed
Service Control Policies - see below under “AWS Organizations”
session policies (“scope down policy”) - passed programmatically as part of STS assume role calls and further restrict permissions (i.e. intersection of identity and resource (principal = user), union of resource (principal = session)), e.g. restrict SFTP users to only parts of the S3 bucket
when creating EC2 instance, IAM role can be applied during create, can add/change from console or cmd line
creating user, access key ID (like a username) and secret access key (like a private key) are auto generated. Only shown once so have to download
Access key used AWS SDK, CLI, HTTPS calls. Not used for AWS console (not an actual password). To grant console access need to create password and grant policies
owner = AWS account owner (ID number)
enforcement order:
1) no policies = deny, 2) gather all “policies”, look for explicit deny, 3) look for explicit allow, 4) deny
where policies is not just IAM:
IAM: inline vs managed policies
organisation policy (i.e. service control policies, SCP)
STS: “scope down policies” or “session policies”
specific service policies, e.g. resource-based policies for S3
interactions with multiple authorisation policies, e.g. S3 bucket policy vs object ACL vs service role. Generally:
unions the permissions and takes the most restrictive
AWS Organisation could deny access even for root user
from ACloudGuru exam sim: “The AWS service receives the request. AWS first authenticates the principal. Next, AWS determines which policy to apply to the request. Then, AWS evaluates the policy types and arranges an order of evaluation. Finally, AWS then processes the policies against the request context to determine if it is allowed.”
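The enforcement order above can be sketched as a small evaluator. This is a simplification under stated assumptions: all applicable statements are pooled into one list, wildcard matching uses Python's `fnmatchcase` (close to, but not identical to, IAM's `*` and `?` rules), and conditions, permission boundaries, SCPs and cross-account rules are not modelled.

```python
from fnmatch import fnmatchcase


def matches(statement, action, resource):
    """True if the statement's Action and Resource patterns cover the request."""
    return (any(fnmatchcase(action, p) for p in statement["Action"])
            and any(fnmatchcase(resource, p) for p in statement["Resource"]))


def evaluate(statements, action, resource):
    """Explicit deny wins, then explicit allow, otherwise default deny."""
    hits = [s for s in statements if matches(s, action, resource)]
    if any(s["Effect"] == "Deny" for s in hits):
        return "Deny"
    if any(s["Effect"] == "Allow" for s in hits):
        return "Allow"
    return "Deny"


# Hypothetical statements gathered from every policy that applies.
statements = [
    {"Effect": "Allow", "Action": ["s3:*"], "Resource": ["*"]},
    {"Effect": "Deny", "Action": ["s3:DeleteObject"],
     "Resource": ["arn:aws:s3:::logs/*"]},
]

if __name__ == "__main__":
    print(evaluate(statements, "s3:GetObject", "arn:aws:s3:::logs/a.txt"))
    print(evaluate(statements, "s3:DeleteObject", "arn:aws:s3:::logs/a.txt"))
    print(evaluate(statements, "ec2:RunInstances", "*"))
```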
typically RBAC (role-based access control) using least-privilege, but alternatively ABAC (attribute-based access control) where identities and resources tagged and access allowed if they match, e.g. “access-project” tag with different values
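A sketch of the ABAC idea using the “access-project” tag. The policy statement uses the `aws:ResourceTag` and `aws:PrincipalTag` condition keys; the EC2 actions are just an example choice, and the helper function is only a local illustration of the tag-match rule, not how AWS evaluates it internally.

```python
# Allow the actions only when the resource's tag equals the caller's tag.
abac_statement = {
    "Effect": "Allow",
    "Action": ["ec2:StartInstances", "ec2:StopInstances"],
    "Resource": "*",
    "Condition": {
        "StringEquals": {
            "aws:ResourceTag/access-project": "${aws:PrincipalTag/access-project}"
        }
    },
}


def abac_allows(principal_tags, resource_tags, key="access-project"):
    """True when both sides carry the tag and the values match exactly."""
    value = principal_tags.get(key)
    return value is not None and resource_tags.get(key) == value


if __name__ == "__main__":
    print(abac_allows({"access-project": "pegasus"}, {"access-project": "pegasus"}))
    print(abac_allows({"access-project": "pegasus"}, {"access-project": "unicorn"}))
```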
active directory federated services credentialing, i.e. use AD to authenticate into AWS console or CLI
user signs into https://fqdn/adfs/ls/IdpInitiatedSignOn.aspx with corporate creds
get SAML assertion (cookie) which says you’re signed on
user’s browser posts SAML assertion to AWS https://signin.aws.amazon.com/saml
federated user (not like a permanent entity like IAM user, they assume a federated role)
granting S3 access to another AWS account, refer to other account by AWS ID
by default users can’t do anything, need policies
multiple AWS account management
multiple accounts for: separation of environments (PROD vs non-PROD), limit blast radius, cost allocation (things like network traffic can’t be tagged), delegate authority, compliance, overcome API throttling limits, ease discoverability
when to use multiple accounts: isolate administrative controls between workloads, limit visibility/discoverability, limit “blast radius”, isolate recovery/audit (implement immutable logs)
Service Control Policies (SCP): centrally manage maximum permissions available to sub-accounts including root (SCP doesn’t apply to master account), cascade deny down hierarchy (must be explicitly allowed at all levels in hierarchy). E.g. no one can use new services (have to be approved)
doesn’t apply to: master account, actions via service-linked roles, managing root credentials (even child, e.g. root password, access keys, MFA), more, only applies to principals/users of the organisation (e.g. S3 bucket that allows non-organisational users to access, can’t block with SCP)
by default: organisations have an allow all, but when you customise it, it’s removed (so you have to allow all, and then deny; or just allows only)
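A sketch of a deny-style SCP, here one that stops member accounts from turning off CloudTrail. It assumes the default allow-all SCP (FullAWSAccess) stays attached. The helper only does exact action-name matching and is a local illustration of “a deny anywhere in the hierarchy wins”, not the real evaluation engine.

```python
import json

# Deny-list SCP: relies on the default allow-all policy remaining attached.
protect_logging_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ProtectCloudTrail",
            "Effect": "Deny",
            "Action": ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
            "Resource": "*",
        }
    ],
}


def denied_by_hierarchy(scps_root_to_account, action):
    """True if any SCP between the root and the account explicitly denies the action."""
    for scp in scps_root_to_account:
        for stmt in scp["Statement"]:
            if stmt["Effect"] == "Deny" and action in stmt["Action"]:
                return True
    return False


if __name__ == "__main__":
    print(json.dumps(protect_logging_scp, indent=2))
    print(denied_by_hierarchy([protect_logging_scp], "cloudtrail:StopLogging"))
```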
can centrally manage service quota limits and bootstrap new accounts with higher limits
master account invites other accounts to join organisation
IAM user/role access between accounts:
“OrganizationAccountAccessRole” created in linked/member account if account created via Organizations which has full admin access to linked account (since no other users/roles created). Should probably delete and replace with lower privilege roles.
Allow IAM users in trusted account to assume role in trusting account. When creating role (that other accounts assume), IAM has a wizard to specify another account as the trusted entity.
vs AWS Control Tower: Control Tower abstracts AWS services (including Organizations) to automatically setup well-architected environments (i.e. automate multi-account setup, Config items, Config Rules, SCP). Can’t be used if Organization already in play. Simpler (less flexible) than Organizations
Tagging
Resource groups
Consolidated billing
AWS Single Sign-On - centrally managed SSO across accounts. Can integrate with on-prem AD. User portal to discover apps and sign in, get temp CLI access keys. Centralise managing user permissions, creating users (e.g send email for them to create a password)
structures:
identity account structure: manage all user under one “identity account”, use trust relationship to assume role in sub-account (e.g. operations account has an admin role, marketing account has a viewer role). Could split by business unit/department, geography, environment. Could be federated users (e.g. AD)
logging account structure: centralised logging repository and account, secured to be immutable. Use SCP to prevent sub-accounts from changing log settings (i.e. log to centralised immutable repository)
publishing account structure: common publishing account of artifacts (AMIs, containers, code, AWS Service Catalog, CloudFormation templates, EBS volumes). Permit sub-accounts to use pre-approved/standardised services/artifacts
information security account structure: hybrid of identity and logging. InfoSec account has users and centralised logging
central IT account: hybrid, IT account has users that are assigned to sub-account roles, and publishes standard artifacts, possibly logging centralisation too
other re:invent account types
custodian account test and verify other accounts (bootstrap accounts with a ReadOnly and ReadWrite role and allow custodian account to assume role - to stop instances to save money, stop attacks). Use 3rd party tooling (since Trusted Advisor and Config are region-specific)
by environment: development (devs can read/write), staging (like production so read only), production (read only), DR
multi-tenant within an account via IAM and ABAC (i.e. tag resources with project/department/user which controls who can modify)
to minimise blast radius, critical shared services (Direct Connect, DNS) could be in their own account
individual developer sandbox accounts for experimentation/training. Set budget which teaches cost-aware architecture.
best practices:
define AWS account-creation process: doesn’t need to be centralised but at least standardised and have business inventory
define company-wide usage policy: what services, minimal security baseline for different scenarios
create information security account: to centralise security monitoring and managing identity, compliance
use automation/scripts to standardise/baseline security/provisioning
example: an account per OU (organisational unit), consolidated billing into one specific billing OU (get economies of scale by combining usage), consolidated security (manage users, federated identities, assume roles in sub-accounts)
AWS Directory Service - multiple types
AWS Cloud Directory - cloud-native directory to control access to hierarchical data (e.g. org chart, course catalog). Good for: cloud apps that need hierarchical data with complex relationships
vs AD/LDAP: AD is single hierarchy, Cloud Directory can be multi-dimensioned (e.g. org chart by department, geographic)
Amazon Cognito - sign-up/in functionality (so you don’t write these yourself), scales to millions of users, federates to social media services. Good for: developing consumer app or SaaS
AWS Directory Service for Microsoft AD - managed MS-AD (standard or enterprise edition) on Windows Server. Good for: enterprise, you need LDAP
can join Linux EC2 instances by: install Kerberos packages, realm join as a user that has domain join permission, update sshd to allow password authentication, restart, SSH in, add “AWS Delegated Administrators” to sudoers list
hybrid cloud - could be a forest with trust relationship to on-prem for just cloud side (servers)
trust relationship between on-prem and cloud: supports one-way (incoming, outgoing) and bi-directional
AD Connector - on-premise users log into AWS (console, EC2, Workspaces, WorkDocs, QuickSight, Chime, Connect, WorkMail) with their existing on-premise AD credentials. EC2 instances can join AD domain. Good for: SSO for on-prem
Simple AD - low-scale, low-cost MSAD-compatible implementation on Samba. Supports users, groups, group policies, and domains. Supports Kerberos-based SSO. Good for: simple user directory, LDAP compatible.
vs AD Connector: Simple AD has no MFA (AD Connector has MFA using existing RADIUS-based MFA infrastructure), no trust relationship support
vs Directory Service for Microsoft AD - Microsoft AD feature rich, supports 5000+ users
IAM integration - assign IAM roles maps to users or groups
AWS Inspector = add agent to EC2 and check for security vulnerabilities using rules within an assessment package (CVE, CIS reports, best practices). Predefined rules only (no custom rules).
findings can go to SNS to Lambda and trigger Systems Manager script to run OS yum/apt update.
AWS Security Hub - view security alerts, compliance status across accounts. Gathers info from Inspector, GuardDuty, Macie. Can run automated checks (e.g. CIS AWS Foundations Benchmark). Graph compliance, send findings.
AWS WAF Web Application Firewall = evaluates request signature and does rule regex matching (e.g. HTTP method, URI, query params, header), rate-based rule (403 if request rate too high from IP), AWS marketplace offers blocking based on IP reputation, geo-restriction.
Pre-baked cross-site scripting (XSS) and SQL-injection detection (just tell it what parts of the request to examine), IPs, countries, or requests too large.
AWS CloudHSM = hardware security module (secure storage and management layer), secure cloud with HSM device, multi-AZ in a cluster, single-tenant, can call from peered VPC, VPN, or Direct Connect
possible to sync with on-prem HSM (contact AWS)
exam said to place near EC2 instances over placing within a private subnet (even with security as dominant requirement)
integration: not as well integrated as KMS (requires custom scripting), but can go via KMS if used as a custom key store, broad 3rd party support
SSL offload from web servers, act as issuing CA, integrate with TDE (Oracle Transparent Data Encryption) for Oracle on EC2 (not RDS)
flavours:
“classic” CloudHSM: based on SafeNet Luna SA device, HA via second device, FIPS 140-2 level 2. Upfront $5K + hourly cost. Deprecated as vendor has announced end-of-life for device (in favour of new version which AWS doesn’t use), AWS will terminate in April 2020, so need to migrate to current iteration.
current CloudHSM: proprietary device, clustered, FIPS 140-2 level 3. Hourly price
gotchas: classic not highly available (need 2 devices across 2 AZs), can lose keys as only backs up once per day, so need to send to multiple HSM devices. Limited integration (no ELB integration, have to terminate on web server).
limits = must be within a VPC, cloudtrail has no device/access logs (that’s CloudWatch event logs)
vs KMS: HSM under your exclusive control (you control root of trust; vs KMS where AWS manages this), need FIPS 140-2 compliance, integration with apps (Java JCE, PKCS#11, Microsoft CNG), high-performance in-VPC cryptographic acceleration (bulk crypto, SSL offloading). But terrible availability
AWS KMS - Key Management Service = customer master key (CMK), which could be imported or AWS generated. Symmetric keys. Can use CloudHSM (Hardware Security Modules). Can audit who’s using keys via CloudTrail. Region-specific
why AWS-generated over import own CMK? AWS-generated can rotate and re-encrypt data for you automatically, define permissions on key. Tight integration with AWS services (e.g. encryption at rest for S3, EBS, DynamoDB).
CMK for: generate your own key material to your requirements, additional DR (although CMK held at lower durability than AWS-generated keys), you need to delete it from AWS and reimport later.
importing CMK: define the key metadata in AWS. Download wrapping public key and import token, encrypt key material with wrapping public key, and then upload the encrypted key material together with the import token.
compliant: PCI DSS Level 1, FIPS 140-2 Level 2.
can use custom key store (CloudHSM + KMS) to store keys instead of KMS default to achieve FIPS-140 level 3, achieve single-tenant model, immediately remove key material (default requires waiting period of 7-30 days) and prove it, or do auditing outside KMS/CloudTrail.
Envelope encryption - KMS generates “data key” and sends to service to do encryption (vs sending the data over the network). Data key is encrypted with master key. Encrypted data key stored by service. To decrypt need master key to decrypt data key
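A toy sketch of the envelope-encryption flow. Loud assumptions: `FakeKms` is a local stand-in for KMS, and the cipher is a SHA-256 keystream XOR used only to keep the example stdlib-only. It is NOT secure and is not what KMS uses; real code would request a data key from KMS and use an authenticated cipher such as AES-GCM.

```python
import hashlib
import os


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). Illustration only, NOT secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class FakeKms:
    """Stand-in for KMS: the master key never leaves this object."""

    def __init__(self):
        self._master_key = os.urandom(32)

    def generate_data_key(self):
        # Returns the data key in plaintext and encrypted under the master key.
        plaintext = os.urandom(32)
        return plaintext, _keystream_xor(self._master_key, plaintext)

    def decrypt(self, encrypted_key):
        return _keystream_xor(self._master_key, encrypted_key)


def encrypt(kms, data):
    data_key, encrypted_key = kms.generate_data_key()
    ciphertext = _keystream_xor(data_key, data)
    # Only the encrypted data key is stored next to the ciphertext;
    # the plaintext data key is discarded after use.
    return {"encrypted_key": encrypted_key, "ciphertext": ciphertext}


def decrypt(kms, envelope):
    # The master key is needed to recover the data key first.
    data_key = kms.decrypt(envelope["encrypted_key"])
    return _keystream_xor(data_key, envelope["ciphertext"])


if __name__ == "__main__":
    kms = FakeKms()
    envelope = encrypt(kms, b"quarterly report")
    print(decrypt(kms, envelope))
```

The point of the pattern is that only the small data key travels to and from the key service; the bulk data is encrypted locally.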
price is $1 per CMK including versions/rotations (need to keep old keys to decrypt old data), and $0.03 per 10,000 requests
programmatically apply “grant”/revoke before/after encryption step to minimise time. Can also have constraint context that grantee needs to provide (case-sensitive key-values, e.g. {“Department”:“Finance”,“Classification”:“Public”})
vs key policies which are static
limit 1000 keys per region. Default keys (if no KMS key specified) are not counted against this. Region-specific, can’t be shared over region.
vs Secrets Manager: KMS tailored for encryption keys, Secrets Manager more tokens/credentials
AWS Secrets Manager - store passwords, encryption keys, API keys, SSH keys, etc (not PKI certs, that’s ACM). Versus storing in physical vault.
access via API with fine-grained control and IAM
rotation of credentials on RDS (for MySQL, PostgreSQL, Aurora) via Lambda (so could do custom integrations: rotate EC2 Oracle passwords)
keys stored securely with KMS keys (you specify master key) using envelope encryption
AWS STS - Security Token Service - grant users limited and temporary access to AWS resources.
Good for: temporary access (expire), users map to roles
Users come from:
enterprise identity federation, i.e. Active Directory. Uses SAML + SSO and grant based on AD credentials. User doesn’t need to be in IAM.
web identity federation with mobile apps (e.g Facebook, Google, OpenID)
cross account access - user access AWS resources belonging to different AWS account
EC2 IAM roles uses STS under the hood (or at least the concepts) and other services
IAM
Terminology
Federation = combining/joining a list of users from different domains (e.g. IAM and AD, Facebook)
Identity Broker = service that allows you to take identity from point A and join to B (i.e. mapping between domains), does verification of user creds
Identity Store = service to store identities, e.g. AD, Facebook
Identities = a user within a domain
Scenario - company employees log into EC2 web site and need access to files in their S3 bucket. Flow
user logs into the reporting application
reporting app sends creds to identity broker
identity broker verifies creds against company LDAP (this is before STS; STS does not check identity sources) for authentication and authorisation
identity broker calls STS, GetFederationToken(duration of token, name of federated user, JSON IAM policy)
can’t provide more power than the IAM user of the caller of GetFederationToken but can scope down
STS checks policies and returns: ARN of federated user, access key, secret access key, session token, expiration date (default 12 hours, max 36 hours, min 15 mins)
identity broker returns STS credentials back to reporting app (assuming access is valid)
reporting app hits S3 with creds
S3 checks perms against IAM
IAM verifies creds are good
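A sketch of what the identity broker might assemble for the GetFederationToken step in the flow above. It only builds the arguments (`Name`, `Policy`, `DurationSeconds`) and does not call AWS; the function name, bucket and prefix are hypothetical. The duration limits are the ones noted above.

```python
import json

MIN_SECONDS = 15 * 60        # 15 minutes
DEFAULT_SECONDS = 12 * 3600  # 12 hours
MAX_SECONDS = 36 * 3600      # 36 hours


def federation_token_request(user_name, bucket, prefix, duration=DEFAULT_SECONDS):
    """Build the arguments a broker might pass to STS GetFederationToken."""
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        raise ValueError("duration outside the 15 minute to 36 hour range")
    # Scope-down policy: can only narrow what the calling IAM user already has.
    scope_down = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
        }],
    }
    return {
        "Name": user_name,
        "Policy": json.dumps(scope_down),
        "DurationSeconds": duration,
    }


if __name__ == "__main__":
    print(federation_token_request("alice", "example-reports", "alice"))
```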
Web Identity Federation Playground - to play around
API:
AssumeRole called by IAM user/role, typically used within account or cross-account access. Param is role.
could implement Token Vending Machine with this if sitting on EC2 (can’t do client-side as they need IAM perms to call this API)
Can 2FA/MFA
AssumeRoleWithSAML any user that passes SAML authentication response from pre-configured identity provider
AssumeRoleWithWebIdentity any user that passes web identity token from pre-configured Amazon Cognito, Login with Amazon, Facebook, Google, or any OpenID-compatible IDP
returns access key ID, secret access key, and security token. Might be scoped down.
Token Vending Machine concept - common way to issue temp credentials for mobile apps.
Modes:
anonymous TVM: access to AWS services-only, doesn’t store user identity
identity TVM: registration, login, and authorisations
but AWS recommends Cognito with its SDK (security built in, don’t reinvent TVM concept)
vs Cognito: Cognito targeted at mobile, Mobile Cognito SDK has security built in
AWS Cognito - for mobile/web, user authentication/authorisation/user management. Save user data, e.g. preferences/game save
also handles users via OpenID Connect (OIDC) compliant IdP’s (e.g. Login with Amazon, Facebook, Google), SAML, or developer authenticated identities. These web users assume pre-defined IAM roles (default is one for authenticated and unauthenticated users)
concepts
user pool - user signs up and into user pool, user directory management, user profiles, security (MFA, phone/email verification, checks for compromised accounts)
identity pool - users authenticate into identity pool to obtain temporary AWS credentials to services and save user profile info
vs IAM: Cognito is mobile-oriented with dedicated APIs, easier integration with web federated identity providers (OIDC), anonymous (unauthenticated access), sync user data across devices-providers
AWS Certificate Manager (ACM) - provision, manage, and deploy certificates for encryption in transit
deploy to API gateway, CloudFront, ELB, Beanstalk, CloudFormation (can be a template resource)
public (identify resource on internet) vs private (within organisation only) certificates
public certs: take time to generate (validate via email or CNAME record), could be hours after validation. No SLA
private certs: easy to renew (no validation)
can generate certificates from AWS’s CA for free, supports wildcard domains, or import 3rd party
does renewal and revocation (seems manual, have to create support case)
can pay for private CA (within private network) which can add more info to certs, CN doesn’t have to be a domain name
private keys stay in region, except for CloudFront which needs to distribute them to edges
best practices:
self-signed: not recommended because can’t be revoked
vs uploading certs to IAM: IAM only advised in regions where ACM not supported. Doesn’t do renewal/tracking.
Intruder Prevention and Detection
detection (IDS) - passive (reacts to attacks), watch network/systems for suspicious activity (too many password attempts), anti-virus. Relies on detecting known attacks so vulnerable to 0-day
prevention (IPS) - active (take automatic action), sit behind firewall and actively scan traffic (drop malicious packets, block bad IPs), possibly reactive (blacklist IP based on building reputation)
typically combined with collection/monitoring agent on each system sending to Security Information and Event Management (SIEM) system, e.g. Splunk, SumoLogic, CloudWatch, S3
re:Invent 2017:
levels of separation: i) one VPC for everything (for early cloud migrations), ii) separate VPC per environment/customer, iii) separate AWS account by business unit, workload, customer, etc (highest separation)
multiple accounts make sense when: administration isolation of workloads desirable, limited discoverability/visibility, minimise blast radius, strong auditing/compliance requirements
recommend AWS accounts for: logging (metrics, compliance) and SecOps
consistently set permissions via CloudFormation (infrastructure as code) vs spreadsheets/forms of access controls
closed loop mechanisms: control (rules), monitor for breaches, and automate fixes:
e.g. no SSH access except from bastion: on SG modification/create, check source, use Lambda to fix it
e.g. no root escalation: CloudWatch with Syslog, freeze account
e.g. require min patch level: Systems Manager SSM agent to inspect and fix with patch manager
e.g. no public S3 access: S3 object-level logging with CloudTrail event firing, with Lambda checking and applying private ACL
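A sketch of the “check” half of a closed loop for the SSH-from-bastion rule: given ingress rules in the shape the EC2 API uses for security groups (`IpProtocol`, `FromPort`, `ToPort`, `IpRanges`), list the sources that should not be able to reach port 22. The bastion address is a made-up placeholder, IPv6 ranges and security-group sources are ignored, and the remediation step (e.g. a Lambda revoking the rule) is left out.

```python
import ipaddress

BASTION_CIDR = ipaddress.ip_network("10.0.0.10/32")  # hypothetical bastion address


def offending_ssh_ranges(ip_permissions):
    """Return CIDRs that allow port 22 from anywhere other than the bastion."""
    bad = []
    for perm in ip_permissions:
        proto = perm.get("IpProtocol")
        if proto == "-1":          # all protocols, all ports
            covers_ssh = True
        elif proto == "tcp":
            covers_ssh = perm.get("FromPort", 0) <= 22 <= perm.get("ToPort", 65535)
        else:
            covers_ssh = False
        if not covers_ssh:
            continue
        for rng in perm.get("IpRanges", []):
            net = ipaddress.ip_network(rng["CidrIp"])
            if not net.subnet_of(BASTION_CIDR):
                bad.append(rng["CidrIp"])
    return bad


if __name__ == "__main__":
    rules = [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}, {"CidrIp": "10.0.0.10/32"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ]
    print(offending_ssh_ranges(rules))
```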
anti-patterns:
root account should use email-distribution list and MFA hardware device in office safe (not person’s email/phone)
account over-crowding (don’t use single account for everything, ownership issues, big blast radius) - instead, use separate accounts per team/business/workload
IP as identity (typically firewall whitelisting) - instead, build services as if they could be publicly addressable (even if just private) with their own authorisation/authentication, e.g. API gateway, service endpoint, or share resources (S3, SQS)
filling in security questionnaires (when customer audits you; not scalable) - instead attest/certify to a standard (HIPAA, PCI DSS, HITRUST). Implement common requirements across frameworks first
manual auditing (interviews, manual analysis of evidence) - instead automate: proactive controls enforced by code, continuous evidence-based auditing
not using AWS managed services (i.e. just EC2, patch version/methodology sprawl) - instead leverage platform (push responsibility to AWS), leverage AWS tools (CloudWatch/Alarms, Inspector, Config, Macie, Trusted Advisor) to standardise sec-ops
e.g. Lambda function running every hour to check all RDS instances are encrypted disks
over-the-wall software development (separate dev, QA, ops teams) - instead CI/CD, embrace DevOps (small self-sufficient multi-disciplinary team) and automate everywhere. If developer is paged at 3am for PROD issue, then they’ll write tests
vs SSDLC (secure software development cycle) - extends DevOps to DevSecOps where security tests are automated too
security whitepaper:
ISMS Information Security Management System:
define scope/boundaries - which regions/AZs/components (state why some are excluded)
define ISMS policy - objectives/principles, legal/compliance, how to measure risk, approvals
select risk assessment methodology - how risks categorised, framework exists
identify risks - map assets to threats
analyse/evaluate risk - calculate business impact, probability, risk level
choose security control framework - framework to set appropriate controls
get management approval - acknowledge residual risk, approval to continue ISMS
statement of applicability - includes: which controls chosen/why, which controls are in place, which controls planned to implement, which controls excluded/why
protect data at rest:
accidental information disclosure: permissions, encryption (file, volume, partition)
data integrity: permissions, integrity check (HMAC, SHA1/2), backup, S3 versioning
accidental deletion: permissions, S3 versioning, MFA on delete
you control encryption method and KMI (key management infrastructure, i.e. storage/management)
storage still could be in AWS (e.g. hardened EC2), or your own on-prem
examples: S3 with client-side encryption, storage gateway (3rd party encrypted disk replicated), RDS with Oracle Transparent Data Encryption (TDE; encryption at column or tablespace-level; no KMS, prevents snapshots being shared), or special drivers that encrypt specific fields for you, EMR with 3rd party (LUKS) encrypted EBS volumes (managed support from 2019)
you control encryption method and management, AWS provides storage (CloudHSM)
use CloudHSM but need your KMI (management layer) on top (SafeNet Luna platform compatible). Not HA itself, need to deploy two in different AZs (and perhaps on-prem too). Custom app to do integration with AWS servers and encrypt/decrypt at HSM.
AWS controls everything (encryption method, key storage and key management)
use KMS (like a HSM device; private keys shouldn’t leave it, you send it data to encrypt/decrypt or envelope encryption to save transferring data about)
more examples: S3 with all flavours (SSE-S3, SSE-KMS, SSE-C), Glacier, EBS encrypted with KMS, Storage Gateway (server side)
Security Asides
SAML vs OAuth vs OpenID
SAML 2.0: both authentication (via assertions) and authorisation (via attributes), XML-based. Attributes allow passing of other info (e.g. user/group membership). Good for SSO, enterprise
OAuth 2.0: share protected assets without login credentials (delegated access, authorisation only). Issues tokens to client, application validates token against authorisation server. Best for API authorisation between apps
OpenID Connect: extends OAuth 2.0 with authentication. REST/JSON. Supporting web/mobile clients, JS. Good for SSO.
Deployment
Command Line
exists on some ec2 instances (e.g. the AMI ones), but can install on your own machine
needs credentials, > aws configure
AWS key ID is credentials for user
stores in ~/.aws so storing on ec2 instance bad (plain text so if machine compromised creds are leaked, hard to manage as you’d need to update all the ec2 instances),
need to use roles instead (ec2 instance will pick up role, so don’t need to do “> aws configure”), not actually managing users/credentials
user data via http://169.254.169.254/latest/user-data (what you provided on EC2 launch, e.g. a launch script)
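A sketch of reading user data from the instance metadata address using only the standard library. It uses the token-based (IMDSv2) request style, which is an addition to the plain URL in the note above: a PUT to `/latest/api/token`, then a GET carrying the token header. `fetch_user_data` only works on an EC2 instance; anywhere else the address is unreachable and the call times out.

```python
import urllib.request

BASE = "http://169.254.169.254/latest"


def token_request(ttl_seconds=60):
    """Build the session-token request (PUT with a TTL header)."""
    return urllib.request.Request(
        f"{BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def user_data_request(token):
    """Build the GET for the user data supplied at launch."""
    return urllib.request.Request(
        f"{BASE}/user-data",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_user_data(timeout=2):
    """Only works on an EC2 instance; elsewhere the address is unreachable."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(user_data_request(token), timeout=timeout) as resp:
        return resp.read().decode()
```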
General Deployments
types:
big-bang - most risk and time
phased-rollout - medium risk, smallest time
parallel-adoption - most time, smallest risk
deployment examples across fleet of EC2 instances:
rolling deployment - launch configuration can’t be modified, need a new one with the new AMI, then we can terminate old instances
A/B testing - route 53 to send % of traffic to different ALB (need two ALB across 2 auto-scaling groups). Common online stores for testing conversion
canary release - update one EC2 instance and monitor for errors
Blue-Green - route 53 across two ALBs (two auto-scaling groups). Redundant configuration, can rollback. Implement via route 53 to swap ALB, change ALB to point to already primed EC2 instances, Elastic Beanstalk (swap environment URL), clone stack using OpsWorks then Route 53 to new stack
blue-green not good when: code and data tightly coupled (can’t rollback if data changed, try to make schema forward/backward compatible), upgrade requires special upgrade steps (pre/post upgrade processor which makes rollback hard), COTS might not be blue-green friendly
Continuous Integration / Delivery / Deployment and whitepaper:
Integration: merging code frequently, common code base, automated integration testing. Associated with build/integration stage of deployment pipeline.
Delivery: automated release artifact (building, testing, packaging), automate deploy to non-PROD environments. Manual push to PROD. PROD migration should be business decision, not technical (as tests done at commit level)
Deployment: can release code to PROD automatically without outage
benefits: automate for speed/consistency, productivity (developers concentrate on code, automation does rest), increased quality (lots of automated testing), release faster with confidence
Pathway to CI/CD:
CI - use common code repository, culture change to commit frequently, merge upstream, create tests (unit, static code analysis)
Continuous Delivery - create a staging environment, more functional tests (integration, component, end-to-end system, performance, compliance, user acceptance)
creating a production environment (can do canary test on one server/region here before entire rollout)
continuous deployment - full automation
more maturity - more staging environments for specific performance/compliance/security/UI tests; unit tests for infrastructure code; integrate with other systems/processes (code review, issue tracking, event notification); database schema migration
3 teams each up to two pizza size (10-12 people):
application - own the app, unit tests. Goal to reduce time spent on non-core application tasks
infrastructure - provides infrastructure for apps. Uses CloudFormation/Terraform.
tools - build/manages CI/CD pipeline. Not DevOps (that’s the sum of all the teams). Uses CodePipeline, CodeStar.
Testing: as more complex, becomes more expensive/slower. Levels: unit (aim for 70% of tests to discover bugs early), service/integration/component, performance/compliance, and UI tests.
Building up the pipeline:
start with minimum viable pipeline (automated unit tests). CodeStar helps here.
continuous delivery: need to automate packaging deployment. CodePipeline could use CodeDeploy, OpsWorks for Chef Automate, or Beanstalk to deploy. Do more system-level tests.
call Lambda in CodePipeline (do backups, 3rd party integrations, setup/teardown resources)
use CloudFormation for infrastructure as code during pipeline
for multi-team applications could have one pipeline per team and a combined release pipeline
Database schema changes - more difficult than binaries since persists state but little logic.
3rd party solutions exist but generally: add a table to the DB to store version information, group schema changes in a file/code, examine DB version to see which schema changes to apply.
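A minimal sketch of that version-table approach using SQLite from the standard library. The table and column names (`schema_version`, `customer`) and the two migrations are made up for illustration; a real pipeline would more likely use an existing migration tool.

```python
import sqlite3

# Ordered schema changes grouped in code; names are illustrative only.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def current_version(conn):
    """Read the highest applied version, creating the version table if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn):
    """Apply every change newer than the version recorded in the database."""
    applied = []
    version = current_version(conn)
    for number, sql in MIGRATIONS:
        if number > version:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (number,)
            )
            applied.append(number)
    conn.commit()
    return applied


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    print(migrate(db))   # first run applies everything
    print(migrate(db))   # second run finds nothing to do
```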
Strategies for Updating Stacks:
prebaking AMIs - faster than bootstrapping (install apps on EC2 start) which helps auto-scaling, more consistent dependencies. Suited for CloudFormation and OpsWorks. E.g. query tag which indicates what version, if old, replace it.
in-place vs disposable upgrade:
in-place - typically for rapid deployment, consistent rollout schedules, sessionless applications or rolling deployment for stateful apps. Suited for CodeDeploy (to tagged instances or ASG) and OpsWorks (to layer)
disposable - EC2 instances are temporary, ensures no unknown dependencies. Suited for CloudFormation, Elastic Beanstalk, OpsWorks, or as part of auto-scaling configuration (update AMI and manually terminate old instances or wait for scale events to replace them using “OldestLaunchConfiguration” termination policy; or add another ASG behind ELB for new instances like immutable pattern)
blue-green - suited for Elastic Beanstalk, CloudFormation, OpsWorks to clone stack and deploy.
Hybrid Deployment Models: e.g. use CloudFormation to create Beanstalk app or an OpsWorks stack, provision with CloudFormation and CodeDeploy for app management.
Objectives: small incremental releases (good for microservices), extremely strong automated test scripts, controlling features (feature branches to be merged later, or deploy it with an on/off toggle)
AWS services:
AWS CodeCommit - code source repository
AWS CodeBuild - creates releases/packages (does checkout onto elastic compute, build, then upload artifacts to S3), can run commands (as if on laptop)
build frequency and speed examples:
weekly dependency upgrades (AWS internally says weekly) and check back into repo
nightly build with CloudWatch scheduled event to trigger nightly and email/Slack failures or use build badge (icon of build status). But, daily feedback is slow.
continuous integration/branch checks with CloudWatch events to build automatically on commits.
But if build takes an hour, still too slow (e.g. downloading lib dependencies into CodeBuild’s empty environment), can cache in S3 (expire with lifecycle policy) or “locally”. To expire, can do manually or via API call, or AWS will after some number of builds
pull request build - full build on pull request and require success for merge
supports VPC so could call Elasticache inside your VPC
parallel builds for speed (split unit tests from static code analysis)
AWS CodeDeploy - deploys revisions of packages. Two steps: i) create revision ii) create deployment
Deployment options:
EC2/on-premise: in-place, blue-green
Lambda: new lambda then traffic managed across using canary, linear, or all-at-once configuration
ECS: documentation says blue-green but new replacement taskset traffic managed across using canary, linear, or all-at-once configuration and then old original task set terminated.
not for S3 deployments
Relies on an AppSpec file which is naff for Lambda because you have to specify the current and target Lambda function versions (use AWS SAM instead)
vs OpsWorks: OpsWorks for Chef Automate (has chef), OpsWorks for Stacks (defines layers to provide abstraction), allows more complex architectures than Beanstalk.
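The canary, linear, and all-at-once traffic configurations noted for Lambda and ECS above can be pictured as a schedule of traffic percentages over time. A minimal sketch, assuming a hypothetical helper `shift_schedule` (not a CodeDeploy API) and illustrative percent/interval values:

```python
from typing import List, Tuple


def shift_schedule(strategy: str, percent: int = 10, interval_min: int = 5) -> List[Tuple[int, int]]:
    """Return (minute, percent of traffic on the new version) steps.

    Hypothetical illustration only; real deployments use a CodeDeploy
    deployment configuration rather than code like this.
    """
    if strategy == "all-at-once":
        return [(0, 100)]
    if strategy == "canary":
        # One small slice first, then everything after a single interval.
        return [(0, percent), (interval_min, 100)]
    if strategy == "linear":
        # Equal increments every interval until 100% is reached.
        steps = []
        minute, shifted = 0, 0
        while shifted < 100:
            shifted = min(100, shifted + percent)
            steps.append((minute, shifted))
            minute += interval_min
        return steps
    raise ValueError(f"unknown strategy: {strategy}")


print(shift_schedule("canary", percent=10, interval_min=5))
print(shift_schedule("linear", percent=50, interval_min=1))
```

Canary exposes a small slice once and then shifts the rest, whereas linear keeps stepping at a fixed rate, so linear takes longer but limits how much traffic moves at any one time.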
cross-account: CodeCommit, CodeBuild (access ECR Docker artifact from another account), CodeDeploy (e.g. shared-services CodeDeploy assumes cross-account role to deploy into production account) and CodePipeline (still tricky to pass artifacts between accounts, steps need to carefully put artifacts in pipeline’s account, or account of next step)
AWS X-Ray - assist with debugging across distributed systems (trace request through AWS using a correlation ID). Graphical timing breakdown.
AWS CodeStar = project template to develop apps using the above (e.g. like what Fargate is for ECS). Examples for EC2, Lambda, Elastic Beanstalk, SDLC (JIRA integration), collaboration, CodeCommit, CodeBuild, CodePipeline
Andrew’s hands on notes under “AWS ECR”
AWS CloudFormation
infrastructure as code as JSON or YAML, to power automated deployment, nesting for re-usability. Can customise resources controlled via SNS/Lambda
why infrastructure as code? reduce cost for manual infrastructure provisioning, reduce errors and increase compliance/standardisation/speed with automation
components
templates: JSON/YAML blueprint for resources of the environment (EC2 instances, LBs, etc)
stack: the actual environment created/updated/deleted by a template. Treated as a single unit
use same template but different stack for different environments using parameters to set sizing
change set: summary of changes/impacts generated by AWS to proposed changes
stack policy - JSON document applied to a stack to protect resources (e.g. in PROD stack, can’t delete RDS, stop haywire CloudFormation template). Once set on a stack, can update via CLI only. Can’t remove it. By default protects everything (prevents updates), so needs explicit allows; an explicit deny overrides an allow
Stack Set: share/manage CloudFormation templates (usually from one account) across accounts/regions to standardise deployment/permissions
change set - see impact of template change without actually doing it. Allows review process (developer creates change set and admin executes it)
gotcha: doesn’t tell you if change will be successful (doesn’t check account limits, permissions)
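A stack policy like the one described above is a small JSON document. A minimal sketch that builds one in Python, assuming a hypothetical logical resource ID `ProductionDatabase`; the helper name `build_stack_policy` is illustrative, not part of any AWS SDK:

```python
import json


def build_stack_policy(protected_logical_ids):
    """Allow all updates, then deny replace/delete on the named resources."""
    statements = [
        # Without an explicit Allow, every resource stays protected.
        {"Effect": "Allow", "Action": "Update:*", "Principal": "*", "Resource": "*"}
    ]
    for logical_id in protected_logical_ids:
        statements.append(
            {
                "Effect": "Deny",
                "Action": ["Update:Replace", "Update:Delete"],
                "Principal": "*",
                "Resource": f"LogicalResourceId/{logical_id}",
            }
        )
    return {"Statement": statements}


# "ProductionDatabase" is a made-up logical ID for illustration.
policy = build_stack_policy(["ProductionDatabase"])
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document that would be supplied when setting the policy on the stack.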
triggering action after deployment (e.g. start testing) - encapsulate trigger into Lambda, add custom resource with “DependsOn” on other instances which triggers Lambda. Example here.
importing existing resources into a (existing) stack - i) define a template that contains the existing and new resources to import ii)
limits: newly imported resources must have DeletionPolicy (but can have any value) and identified by specifying a unique ID during the import; only some resources support import; config in template doesn’t need to match actual config (will create drift that needs to be corrected later, i.e. multi-step process to import)
organisation
stacks can export values (“Outputs” section) for other stacks in the same region to import via Fn::ImportValue (e.g. stack A defines VPC, stack B puts EC2 instances on top)
define reusable template snippets in S3 and include using Fn::Transform: Name: AWS::Include syntax. Gotchas: if S3 snippet updates, stack won’t know about it (check for change sets)
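The export/import relationship between stacks can be sketched with two templates held as Python dicts. This is an illustration only: the export name `network-PublicSubnetId`, the AMI ID, and the helper functions are assumptions, and the check simply confirms every Fn::ImportValue has a matching export.

```python
# Stack A: network stack that exports a subnet ID.
network_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {"CidrBlock": "10.0.0.0/16"}},
        "PublicSubnet": {
            "Type": "AWS::EC2::Subnet",
            "Properties": {"VpcId": {"Ref": "Vpc"}, "CidrBlock": "10.0.1.0/24"},
        },
    },
    "Outputs": {
        "PublicSubnetId": {
            "Value": {"Ref": "PublicSubnet"},
            "Export": {"Name": "network-PublicSubnetId"},  # hypothetical export name
        }
    },
}

# Stack B: app stack that imports the subnet ID exported by stack A.
app_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-12345678",  # placeholder AMI ID
                "InstanceType": "t3.micro",
                "SubnetId": {"Fn::ImportValue": "network-PublicSubnetId"},
            },
        }
    },
}


def exported_names(template):
    """Names published through the Outputs section."""
    outputs = template.get("Outputs", {}).values()
    return {o["Export"]["Name"] for o in outputs if "Export" in o}


def imported_names(node):
    """Recursively collect every name referenced through Fn::ImportValue."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Fn::ImportValue" and isinstance(value, str):
                found.add(value)
            else:
                found |= imported_names(value)
    elif isinstance(node, list):
        for item in node:
            found |= imported_names(item)
    return found


missing = imported_names(app_template) - exported_names(network_template)
print("unresolved imports:", sorted(missing))
```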
best practices:
split template by ownership/lifecycle (e.g. web site team owns web site resources, DBAs control database)
organisation: nested stacks (parent-child, creates new child resources) and/or cross-stack references (export/import values, e.g. network stack with VPC and subnets, cross-referenced by web app stack needing public subnet)
update AWS Python helper scripts (on EC2 instances)
make all changes via the template (if manual changes via CLI/console, could fail during next stack update) and version control it, use change set to check for issues
lint CloudFormation template with AWS CLI “ValidateTemplate”, cfn-nag or cfn-check tools. Pre-baked “CloudFormation Validation Pipeline” reference solution for this. Just like coding, small templates with tests to fail faster than one giant template.
Andrew’s hands-on notes:
don’t modify/delete outside of CloudFormation or else updates may get stuck (and then you have to wait an hour for the stack to timeout before you can delete/modify). Creates drift.
whitepaper:
resource lifecycle:
provisioning - CloudFormation
configuration management (i.e. upgrades)
could use CloudFormation to stand up new env and drop the rest (immutable infrastructure, no configuration drift) via blue-green
or incremental changes via EC2 Systems Manager or OpsWorks for Chef Automate
monitoring and performance - CloudWatch + alarms/SNS Lambda trigger. Use log agent to ship logs to CloudWatch
governance and compliance - AWS Config (change timeline, rules to check compliance and programmatic responses)
resource optimisation - Trusted Advisor (checks against best practice)
best practices - quality control (unit tests, static code analysis before deployment), idempotent infrastructure code allowing redeploy, make changes auditable/compulsory
re:Invent video
protecting / guard rails: i) termination protection on stack, ii) stack policies (see above), iii) resource-level policies, e.g. “DeletionPolicy” of “Retain” or “Snapshot” (for EBS volume, RDS instances, Redis, etc), iv) IAM (deny creating stacks with IAM policies)
monitoring stacks and changes:
configuration drift detection - drift (out of band changes to properties, default values (which aren’t explicitly in the template), or deletes). Introduces audit/compliance issues.
rollback trigger - during stack create/update, monitor for an alarm and automatically roll back
AWS Config for auditing stack changes. AWS Config rules to ensure stacks are publishing to SNS.
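The resource-level guard rail above (“DeletionPolicy” of “Retain” or “Snapshot”) can be checked before deployment. A minimal sketch, assuming a team convention that stateful resources must set the policy explicitly; the `STATEFUL_TYPES` set and the function name are illustrative choices, not an AWS tool:

```python
# Resource types this hypothetical check treats as stateful.
STATEFUL_TYPES = {
    "AWS::RDS::DBInstance",
    "AWS::EC2::Volume",
    "AWS::ElastiCache::ReplicationGroup",
}


def unprotected_resources(template):
    """Logical IDs of stateful resources without an explicit Retain/Snapshot policy."""
    flagged = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") not in STATEFUL_TYPES:
            continue
        if resource.get("DeletionPolicy") not in ("Retain", "Snapshot"):
            flagged.append(logical_id)
    return sorted(flagged)


example = {
    "Resources": {
        "Database": {"Type": "AWS::RDS::DBInstance", "DeletionPolicy": "Snapshot"},
        "DataVolume": {"Type": "AWS::EC2::Volume"},
        "WebServer": {"Type": "AWS::EC2::Instance"},
    }
}
print(unprotected_resources(example))
```

A check like this could run alongside template linting so a missing policy fails the build rather than being found after a delete.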
Management Tools
AWS CloudWatch - console give quick global picture of your cloud, performance metrics, dashboard
EC2 instances automatically have basic monitoring (5 min frequency) upgrade for $ to detailed monitoring which increases frequency to 1 min by default (also high resolution 1-second), and allows aggregation
CloudWatch agent - more detailed monitoring (e.g. memory), works on-premise
retention
data points with period <60 seconds available for 3 hours (high-resolution custom metrics)
data points with period 1 minute for 15 days
data points with period 5 minute for 63 days
data points with period 1 hour for 455 days/15 months
data will be aggregated after period, e.g. 1 minute metrics after 15 days aggregated to 1 hour. Once at 15 months it is deleted
about 12K-20K points
can’t delete, expire based on retention above. Deleted EC2/ELB keeps metrics for 15 months
if EC2 instance deleted or monitoring disabled, search limited to 2 weeks after metric last ingested (ensure most up to date instances are shown)
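The retention tiers above amount to a lookup from metric period to how long data points are kept. A small sketch using only the figures from these notes; the function names are made up for illustration:

```python
from datetime import timedelta


def retention_for_period(period_seconds: int) -> timedelta:
    """How long data points of a given period are retained, per the tiers above."""
    if period_seconds < 60:
        return timedelta(hours=3)   # high-resolution custom metrics
    if period_seconds < 300:
        return timedelta(days=15)   # 1-minute data points
    if period_seconds < 3600:
        return timedelta(days=63)   # 5-minute data points
    return timedelta(days=455)      # 1-hour data points (15 months)


def finest_period_for_age(age: timedelta) -> int:
    """Smallest period (seconds) still retrievable for a data point of this age."""
    for period in (1, 60, 300, 3600):
        if age <= retention_for_period(period):
            return period
    raise ValueError("older than 455 days: data point has been deleted")


print(finest_period_for_age(timedelta(days=20)))
```

For example, a data point that is 20 days old is past the 15-day window for 1-minute data, so the finest granularity still available is 5 minutes.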
metrics at hypervisor level (CPU, disk, network, but memory not there)
CloudWatch events - helps respond to state changes, create rules to route events to targets, e.g. invoke Lambda func to update DNS entries when ec2 instance enters “running” state, direct specific API records from CloudTrail to Kinesis stream for analysis, periodically snapshot EBS volume
CloudWatch logs - agent sends logs to CloudWatch (e.g. Apache, app, kernel logs, CloudTrail events, or even to archive logs)
alarm - notify topic (e.g. email) if threshold in alarm state, or do EC2 action (e.g. if CPU <5% for a day, terminate the instance). Can be up to 10-second high resolution. Normally 60-second.
cost: by number of metrics (“basic” is 5-minute frequency, “detailed” is 1-minute frequency and costs 7x standard), num of alarms (60-sec standard resolution, 10-sec high resolution), num of dashboards, number of API requests, storage of logs (GB-mo)
AWS CloudTrail = auditing users and API usages (not so much performance/health), when ec2 provisioned, changing things, NOT performance metrics (CloudWatch), have to turn it on
event history (default) is for 90 days (create a trail for longer retention)
to process events (save to S3, but in naff .gz format) need to send to CloudWatch log stream and filter there
vs CloudWatch: CloudTrail is: operations, lower-level, no native alarming (can pipe into CloudWatch and create alarms there)
CloudWatch is: activities, higher-level monitoring/eventing
trusted advisor = scans env for cost savings and security; a free tier of core checks, with full checks on business/enterprise (but not basic/developer) support tiers
areas of: cost optimization, performance, security, fault tolerance
free: 5x security and 1x performance checks
vs AWS Config? Trusted Advisor is not configurable, no automated remediation actions (not even notifications), doesn’t track historical results (so can’t trend), has to be run explicitly (vs AWS config runs as config is modified) but maybe cheaper (included as part of support vs AWS config $2/rule/month)
If Docker/Packer, this encapsulates config so you need provisioning more
Chef: config management. Puppet: config management. Ansible: config management. CloudFormation: provisioning. Terraform: provisioning
Mutable vs Immutable Infrastructure Focus?
Docker represents immutable (replace container with brand new container) vs mutable (in-place upgrade)
Chef: mutable. Puppet: mutable. Ansible: mutable. CloudFormation: immutable. Terraform: immutable
Procedural vs Declarative?
Procedural code doesn’t encapsulate current state of infrastructure (it’s just changes, more complexity), order matters.
Declarative code represents current state of infrastructure. Smaller code but less expressive (harder templating logic)
Chef: procedural (recipes are Ruby DSL). Puppet: declarative. Ansible: probably procedural (e.g. to scale in/out need to write code instead of just specifying the number needed), though modules are more declarative. CloudFormation: declarative. Terraform: declarative
Master?
Master centralises config, monitoring, can run continuously to enforce state.
Chef: master server by default. Puppet: master server by default. Ansible: no master by default (you SSH from whatever machine). CloudFormation: no master by default (connect to cloud provider’s APIs). Terraform: no master by default (connect to cloud provider’s APIs)
Agents?
Having agents: how to bootstrap/upgrade/secure agents? Purpose-built agents standardise libraries/environments
Chef: yes. Puppet: yes
State Management
How state is interrogated and whether it’s persisted
Chef: Chef Infra Client runs a recipe which defines the desired state and how to transition to this state
Puppet: Facter agent interrogates node, sending facts to Puppet master.
Ansible: uses SSH/WinRM to interrogate the system to determine facts. Nodes can define custom facts in a file for Ansible to discover.
CloudFormation: handled by AWS itself
Terraform: has a .tfstate file (need a backend to share it with team) which maps resources in configuration to the actually deployed resources, maintains dependencies and knows if resource deleted. Allows Terraform to be cross-provider.
Combos:
provisioning + config management: E.g Terraform provisions infrastructure + Ansible for app deployment. Easy to start, but Ansible more procedural
provisioning + server templating: Terraform + Packer (package apps as VMs). Easy to start, immutable infrastructure. But VMs long time to build/deploy. Terraform doesn’t do blue-green deployment
provisioning + server templating + orchestration: e.g. Terraform, Packer, Docker, Kubernetes: Packer builds Docker images which Terraform provisions infrastructure and deploys. Forms a Kubernetes cluster of Docker containers which can handle deployment strategies, scaling, healing. But expensive/complex.
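The procedural vs declarative distinction in the comparison above can be shown with a toy scaling example. This is a conceptual sketch in plain Python, not how any of these tools is implemented; both function names are invented:

```python
def procedural_scale_out(current_count: int, add: int) -> int:
    """Procedural: the script says what to do, so running it twice adds twice."""
    return current_count + add


def declarative_plan(current_count: int, desired_count: int) -> dict:
    """Declarative: state the target; the change is derived from current state."""
    delta = desired_count - current_count
    return {"create": max(delta, 0), "destroy": max(-delta, 0)}


# Procedural: re-running "add 5 servers" keeps adding servers.
count = procedural_scale_out(10, 5)
count = procedural_scale_out(count, 5)
print(count)

# Declarative: re-running "want 15 servers" is a no-op once the target is met.
print(declarative_plan(10, 15))
print(declarative_plan(15, 15))
```

The declarative version needs to know the current state, which is why tools in that camp keep or query state before planning a change.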
AWS Config
Helps with ITIL configuration management compliance
why? record config changes (audit history), define rules and trigger notifications, inventory (reference doc for metadata collected)
e.g. all EFS/EBS volumes connected are encrypted, backup enabled on RDS, CloudTrail is enabled
AWS Config Rules are either managed (run by AWS, prebaked rules) or custom (write Lambda function)
notifications: tracks resources against baseline: can trigger SNS which could remediate the issue
understands relationships between AWS resources (e.g. EC2 instance associated with security groups, records change to both resources)
gotchas: helps with compliance but doesn’t prohibit (i.e. rules evaluated after creation, not during creation), but can remediate using Systems Manager Automation document or notify to Lambda
vs Systems Manager Inventory (see “Systems Manager” section)
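A custom AWS Config rule boils down to a function that looks at a recorded resource configuration and returns a compliance verdict. A minimal sketch of that core logic for the “EBS volumes are encrypted” example above, assuming a simplified configuration item shape; a real Lambda-backed rule would also parse the invoking event and report the result back to AWS Config, which is omitted here:

```python
def evaluate_volume(configuration_item: dict) -> str:
    """Return a compliance verdict for one (simplified) configuration item."""
    if configuration_item.get("resourceType") != "AWS::EC2::Volume":
        return "NOT_APPLICABLE"
    # Assumed field layout: the volume's recorded settings under "configuration".
    encrypted = configuration_item.get("configuration", {}).get("encrypted", False)
    return "COMPLIANT" if encrypted else "NON_COMPLIANT"


# Hypothetical items for illustration.
print(evaluate_volume({"resourceType": "AWS::EC2::Volume", "configuration": {"encrypted": True}}))
print(evaluate_volume({"resourceType": "AWS::EC2::Volume", "configuration": {"encrypted": False}}))
```

Because the rule only evaluates after the resource exists, a NON_COMPLIANT verdict is a trigger for remediation rather than a block on creation, as the gotcha above notes.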
AWS OpsWorks
managed Chef and Puppet (automation platforms). Config management for deploying, automation, configuration, upgrade. Maintains consistent state
flavours:
OpsWorks for Chef Automate: managed chef server (holds config, coordinates tasks), can manage on-prem, chef premium (chef workflow for CI/CD, chef visibility for dashboard of nodes, chef compliance for testing policy)
pay for num of nodes managed per hour, EC2 instance for server
components:
workflow: coordinate dev, test, prod, including quality gates
compliance: report Center for Internet Security (CIS) violations
visibility: visualise workflows and compliance
OpsWorks for Puppet Enterprise
OpsWorks for Stacks - AWS-creation compatible with Chef recipes. Supports on-prem. Based on chef-solo (no master server)
stack is collection of resources to provide a service. Layers are components within the app (compute = EC2, data = RDS, ELBs = balancing). Instance setup:
24/7 instances to handle base load plus…
auto-scaling based on two types: time-based (schedule for predictable demand) and load-based (e.g. scale up on 80% CPU, down at 60%, or memory or load average; Linux instances only)
not ASG auto-scaling, feels like its own simplified version (to support on-prem): doesn’t actually create instances just starts and stops them (so you need to ensure instances already created), Windows not supported for load instances.
layers have lifecycle events which can trigger chef recipes: setup (after instance booted, e.g. to install software. Not online till setup completed), configure (run on all instances when instance joins/leaves, e.g. configure software load balancer), deploy, undeploy, shutdown
region specific. Can manage/clone within region (regions as a form of resource isolation)
vs OpsWorks for Chef Automate:
Stacks are not full chef features, OpsWorks cookbooks might not be backwards compatible with chef (e.g. AWS specifics), region specific
Stacks more provisioning, modelling, deployment focus. Chef more configuration management but overlap exists
Systems Manager
centralised console and tools for managing large fleets (10s, 100s of instances)
organise: groups of resources (tagging), by layers (data, balancing, compute), by environment (prod, dev)
SSM Agent (AWS Systems Manager Agent) in most AMIs, or on-prem too
Inventory: collects OS, application info on EC2 or on-prem instances, e.g. which instances have Apache HTTP 2.2 or earlier, track licenses
metadata collected doco and custom too (but naff in that it’s static JSON document, not actually scripted to pull config, unless you created a task to do this yourself)
vs AWS Config: Config has historical changes, can evaluate rules and trigger notifications, but fixed metadata
state manager: creates and tracks state of resources, e.g. Apache HTTP 2.2 instances are “pre-upgrade”. Can run config management tool (PowerShell, Ansible) to upgrade state
logging: though recommend just stream logs (e.g. web server) to CloudWatch
parameter store: store global config data and secrets (integrates with Secrets Manager, KMS; CloudFormation)
maintenance window: define windows which other services (Patch Manager) can use
automation: trigger by maintenance window, response to CloudWatch, or invoked via console/CLI,
e.g. shutdown lower envs outside business hours (although AWS Instance Scheduler specifically for this now); automate creating AMI in standard way, CloudFormation to create/tear entire stacks.
Run Command: run script without RDP/SSH/bastion (more secure). Run script on multiple machines at once. Probably uses Session Manager under the hood.
restrict actions using IAM policies
Session Manager - browser shell (from console or CLI) to access Linux/Windows (on-premise too) instances without requiring open SSH/RDP ports, managing SSH keys, or bastion hosts (instance doesn’t need public IP). Needs SSM agent installed.
Patch Manager: OS and application patching EC2 instances, define rules to auto-approve patches (with exceptions), schedule patching during maintenance window
baseline: defines which patches auto-approve (e.g. Microsoft critical and important (7 days ago) auto-approve, but optional and important (newer than 7 days) don’t approve). AWS has pre-baked baselines (different names for each OS: “AWS-AmazonLinuxDefaultPatchBaseline”, “AWS-RedHatDefaultPatchBaseline”, “AWS-WindowsPredefinedPatchBaseline-OS” (alias of “AWS-DefaultPatchBaseline”))
systems manager documents (SSM documents): JSON/YAML files for versioning. Flavours:
command document: used by run command (what to actually run) and state manager (do update config)
policy document: used by state manager to enforce policy on target (defines conditions for a state, e.g. Apache HTTP 2.2 or lower is “pre-upgrade”). If removed from target, policy action (upgrade Apache) doesn’t happen
automation document: used by automation (defines steps during automation, e.g. create AMI)
how? some are pre-baked (e.g. patch Linux AMI) or write your own (JSON/YAML)
e.g. creates CloudTrail in each account to log to central bucket; shutdown EC2 instances outside business hours
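To make the command document flavour above concrete, here is a minimal sketch that builds one as a Python dict and prints it as JSON. The helper name, step name, description, and shell commands are illustrative assumptions; the overall shape follows the schema version 2.2 layout for documents run with Run Command.

```python
import json


def command_document(description, commands):
    """Build a simple SSM command document that runs shell commands."""
    return {
        "schemaVersion": "2.2",
        "description": description,
        "mainSteps": [
            {
                "action": "aws:runShellScript",
                "name": "runCommands",  # hypothetical step name
                "inputs": {"runCommand": list(commands)},
            }
        ],
    }


# Example commands are placeholders for illustration.
doc = command_document("Report Apache version", ["httpd -v || apache2 -v"])
print(json.dumps(doc, indent=2))
```

Keeping the document as versioned JSON/YAML is what lets the same script be targeted at many instances and restricted through IAM.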
AWS Service Catalog
Framework allowing administrators to create pre-defined products/landscapes for users to consume
granular control over which users have access to which offerings e.g. setup groups mapping to a set of services
uses “adopted IAM roles” so users don’t need underlying service access
push button deployment consumption based on CloudFormation templates (“products” are versioned within a portfolio)
service catalog constraints:
launch constraint: IAM role that service catalog assumes when end-user launches product (i.e. user adopts this role)
this is at product level within a portfolio (product-portfolio association), so could have multiple launch constraints too
not required but alternative is user having all of the IAM permissions
template constraint: one or more rules narrowing down selectable values during deploy (e.g. use cheap EC2 instances in DEV, ask if storing PII so deploy encrypted). At product-portfolio level. Affects all product versions for subsequent launches.
resource update constraint: whether user can update the resources in a provisioned product
in multi-account scenario:
can share service portfolio with accounts to be imported by other account’s admin and remain in sync, or add their own constraints as a local portfolio
users, groups, and roles not in sync (need to add manually). Subtleties: imported template will have owner account’s launch constraint, so local admin could override as a local portfolio
Licensing
Microsoft licensing
License Mobility (benefit of Microsoft Volume Licensing) allows use on EC2 default tenancy (fill in forms), but don’t need this for EC2 dedicated hosts or instances
For Windows Server, need dedicated hosts (but not dedicated instances) to bring own license (as license is tracked to hardware)
AWS doesn’t sell Windows Client (7, 8, 10). BYOL on dedicated host
Dedicated Host (not Dedicated Instance) provides best BYOL experience
Oracle licensing
RDS has no license-included option for enterprise edition. You have to BYOL.
Cost Management
Terms:
Capital Expenditure (CapEx) - slow, so usually done only every so many years; have to forecast; large upfront cost
Operational expenditure (OpEx) - pay as you use (lean would say pay for what you need)
Total Cost of Ownership (TCO or “Ownership” in AWS terms since cloud is different) - comprehensive look at entire cost of given decision/option, likely including soft (time, difficulty, complexity) costs too
Return on Investment (ROI) - amount can expect to receive back within a certain time given an investment
typical traps: on-prem TCO underestimated (fixed costs of a datacenter (rent, fire-suppression, insurance, compliance, depreciation) difficult to allocate, soft cost not examined); cloud has steep learning curve; IT assumptions can be buried within business cases
Cost Minimisation Strategies
appropriate provisioning - N+1 provisioning, horizontal scaling. Consolidate resources (one large RDS (see bullet below), one big DynamoDB table, multiple small containers on one EC2 instance that are too small to have their own; Lambda probably good if <40% utilisation). Shutdown unused things. CloudWatch can monitor utilisation.
i.e. use a consumption model
is consolidating smaller RDS into one RDS cost effective? Ignoring increased blast radius, generally cost of “xlarge” is half of “2xlarge” which is half of “4xlarge” and generally get double CPU/memory. Some things scale up nicely (T2/T3 baseline increases so more CPU and higher baseline), but some don’t (m5/r5 ECU tapers off, EBS/network bandwidth more but not double) so possibly not cheaper ECU/EBS bandwidth/network wise. But cheaper if EBS volume utilisation or ECU utilisation is too low for small instance types.
right-sizing - use lowest-cost resource that meets the spec. Iterative exercise (as prices drop, new instance types).
Need monitoring of resource (low utilisation are candidates) and end-user experience using 99 percentile over well-chosen time period
Buffer-based architecture by smoothing spikes with SQS (pull, FIFO, or once-only)/SNS (push)/Lambda/DynamoDB/Kinesis Data Streams (multiple consumers). Consider delay in publishing/subscribing, and duplicate handling
Schedule in new (spot) instances at peak.
purchase options - reserved instances for permanent apps (EC2 Hosts, RDS, Redshift, ElastiCache, can also reserve CloudFront and DynamoDB too), spot instances for temporary workloads or scale-out. EC2 fleet lets you define target mix of on-demand, reserved, and spot instances
geography/region selection - trade latency for cheaper services
optimise data transfer - data going out and between regions can be significant. DirectConnect can be cheaper at a given data volume/speed. CloudFront is cheaper to send to edge location than to internet.
whitepaper:
measure overall efficiency - measure business output and cost associated to achieve it
stakeholders (CFO set budgets and reporting, business owner forecast and monitor usage, tech lead implement to financial goals, 3rd parties to align engagement to financial goals) involved in expenditure discussion
visibility and governance - breakdown spend (tags), utilisation (reports), and forecast via AWS Billing and Cost Management services. Alert when budget at percentage threshold.
governance via service limits and IAM policies (restrict resource creation)
attribute expenditure - help calculate efficiency and ROI via AWS Account structure (separate cost-centers, set service limits on an account, or get more aggregate services, isolate instance reservations)
entity lifecycle tracking - remove resources when not required (employee, project). AWS Config for inventory. CloudTrail and CloudWatch for recording lifecycle events. Use federation instead of creating IAM users.
optimising over time - monitor utilisation gap (excess capacity). Need a specific org function to do this. Set goals and monitor (reduce cost per transaction by x% every year; % of EC2 instances that are turned on and off in a day, 80-100% achievable; % of always on instances that are reserved capacity; % free of EBS volumes)
enforcement - AWS Config Rules. Could shutdown EC2 instances that are inappropriately tagged.
resource groups - group resources based on tags, by default AWS console ordered by service, but can organize by group to show info (region name, health checks, EC2 IPs, ELB ports, RDS). Customise console
group based on tag name (or portion of name) can nest (e.g. Developers > TeamA)
export as CSV, modify in tag editor
example grouping: by environment, project, business workload, department or cost center
Tools
AWS Budget - define limit and notify if budget exceeded. Not just cost, also usage (# of EC2/ELB hours) and reserved instances. Can filter to tags, good for making costs visible to everyone.
consolidated billing could have multiple linked AWS accounts (dev/test, prod, back office) all billed to one account, track different costs (still breakdown per account), get volume pricing discount across all linked accounts (e.g. S3 costs decrease, share reserved instances if both accounts don’t disable this feature)
paying account can’t access resources of the other accounts by default, linked accounts independent by default
limit of 20 linked accounts, but contact AWS to increase
gotchas: this is legacy, use AWS Organizations instead (easy to migrate to this, backwards compatible, and more powerful)
AWS Resource Access Manager (RAM) - share AWS resources (EC2 subnets, transit gateway, capacity reservations, licenses, clone Aurora cluster, Route 53 forwarding rules to resolve domain using on-prem resolver) across any accounts or within organisation
Non-AWS, is public cloud actually cheaper?
of course public cloud vendor will tell you it’s cheaper
probably not cheaper if you just lift-and-shift without modernisation (migration cost outweighs cost savings)
which is why AWS recommends comparing to future-state AWS architecture (horizontal scaling in/out, scheduled turn off of resources) and factor in inaction
running old processes in cloud will be more expensive (not being agile, DevOps, not embracing automation, not embracing innovation/experimentation)
Gartner blog highlights: migration expensive, ROI only positive after efficiencies realised (right sizing, scheduled shutdown, reserved instances)
IT spend could increase as savings re-invested into modernisation/re-architecture
cost savings more long term optimisation goal, real benefit is flexibility/agility (resilient to external forces, productivity, innovation).
Application Integration
AWS SQS - Simple Queue Service
Reliably store messages on a queue while waiting for a computer to process them, loosely couples applications together, buffers in multi-producer/multi-consumer setups (which don’t need to coordinate* see limits below)
polling, designed for high availability and guaranteed delivery (app has to delete the message from the queue and handle dupes)
queue types (Standard Queue vs FIFO Queue):
Performance: Standard unlimited. FIFO high, 3000 messages per second with batching (300 without batching)
Ordering: Standard best-effort. FIFO is FIFO
Delivery: Standard at-least-once. FIFO exactly-once (consumer has to delete it)
Max messages in-flight: Standard 120,000. FIFO 20,000
Message failure: Standard retries until retention period expires or sent to dead-letter queue. FIFO bad message blocks consumers; can use dead-letter queue.
FIFO concepts:
message group ID can be used so that different producers use different message group IDs therefore preserve order within a producer
deduplication ID - either a hash of the message content (not attributes) or you provide it, which prevents dupes within a deduplication interval (5 minutes)
used for exactly once processing - if post again, won’t be added
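The deduplication behaviour above can be modelled in a few lines. This is an in-memory simulation for intuition, not the SQS client API: the class name is invented, the clock is passed in explicitly, and SHA-256 of the body stands in for content-based deduplication.

```python
import hashlib
from typing import Dict, Optional

DEDUP_INTERVAL_SECONDS = 5 * 60


class FifoDeduplicator:
    """Toy model: reject a repeat of the same deduplication ID within the interval."""

    def __init__(self) -> None:
        self._seen: Dict[str, float] = {}  # dedup ID -> time first accepted

    def accept(self, body: str, now: float, dedup_id: Optional[str] = None) -> bool:
        # Fall back to a hash of the content when no explicit ID is supplied.
        key = dedup_id or hashlib.sha256(body.encode("utf-8")).hexdigest()
        first_seen = self._seen.get(key)
        if first_seen is not None and now - first_seen < DEDUP_INTERVAL_SECONDS:
            return False  # duplicate inside the interval: not added again
        self._seen[key] = now
        return True


dedup = FifoDeduplicator()
print(dedup.accept("order-1", now=0))    # first send is accepted
print(dedup.accept("order-1", now=100))  # resend within 5 minutes is dropped
print(dedup.accept("order-1", now=301))  # after the interval it is accepted again
```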
VisibilityTimeout - when message pulled by an application, other apps won’t see it during timeout window, default 30 sec (max 12 hrs), (helps prevents dupe processing but doesn’t eliminate it, e.g. if node takes a long time, so times out and is picked up by another node), on timeout will be visible again
can secure data with KMS
retention period of 4 days by default, 14 days max (SWF is 1 year)
handling problematic messages with dead-letter queue: redirect messages to this queue after message fails to process a specified number of times (specified in redrive policy)
don’t use dead-letter queue for: FIFO queue where exact ordering is critical, want to retry during entire retention period
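How the visibility timeout and the dead-letter redrive interact is easier to see in a simulation. The sketch below is an in-memory toy with an explicit clock, not the SQS API; `SimQueue`, its fields, and the default values are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    body: str
    receive_count: int = 0
    invisible_until: float = 0.0


@dataclass
class SimQueue:
    visibility_timeout: float = 30.0
    max_receive_count: int = 3  # plays the role of the redrive policy setting
    messages: List[Message] = field(default_factory=list)
    dead_letters: List[Message] = field(default_factory=list)

    def send(self, body: str) -> None:
        self.messages.append(Message(body))

    def receive(self, now: float) -> Optional[Message]:
        for msg in list(self.messages):
            if msg.invisible_until > now:
                continue  # still in flight with another consumer
            if msg.receive_count >= self.max_receive_count:
                # Too many failed attempts: move to the dead-letter queue.
                self.messages.remove(msg)
                self.dead_letters.append(msg)
                continue
            msg.receive_count += 1
            msg.invisible_until = now + self.visibility_timeout
            return msg
        return None

    def delete(self, msg: Message) -> None:
        # The consumer deletes the message once processing succeeded.
        self.messages.remove(msg)


queue = SimQueue(visibility_timeout=30, max_receive_count=2)
queue.send("job")
first = queue.receive(now=0)
print(queue.receive(now=10))         # None: hidden during the timeout
print(queue.receive(now=30).body)    # visible again because it was never deleted
print(queue.receive(now=60), len(queue.dead_letters))
```

The second delivery at `now=30` is the duplicate-processing case mentioned above: a slow consumer that misses the timeout lets another consumer pick up the same message.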
alarm on queue size to auto-scale queue consumers
cost - on message size (in 64kb chunks), #requests (first 1M free, $0.4 per M per month for standard, $0.5 for FIFO)
patterns/examples: meme generator puts to queue, app picks up generation request and publishes it
application: async pull task message from queue, get the file from S3, process the file, write back to S3, write “task complete” to another queue, delete original task message, loop and check for more tasks
limits
1 request can have 1-10 messages with max 256kb total payload, 256kb message sizes, max 120K messages in-flight per queue (store actual data in S3 if hit limit and could store metadata in DynamoDB - Java SDK can do this for 2GB payloads)
max 12 hr visibility, “at least once delivery” (app needs to handle duplicates or use FIFO)
no priority (workaround with a different “priority” SQS queue), random order (unless FIFO)
vs
SNS (see “AWS SNS” section)
vs Kinesis Streams (see “Kinesis Data Streams”)
architecture tips
batch receive/delete or put up to 10 messages
process oriented messaging (delete old message from “from” queue, update message and add it to “to” queue) vs document-oriented (single message per user/job that is annotated as it flows)
long versus short polling:
short polling (default) when ReceiveMessage API call has WaitTimeSeconds=0 or queue attribute ReceiveMessageWaitTimeSeconds=0. Quickest response, but only polls a subset of servers (could be empty or false empty responses)
long polling gives more time for SQS service to wait for messages in queue up to a timeout (max 20 seconds; be careful of connection timeouts), reduces request count (which reduces $)
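The batching tip and limits above (up to 10 messages and 256 KB total per request) suggest grouping messages before sending. A minimal sketch, assuming message size is just the UTF-8 length of the body (message attributes are ignored here) and using a made-up helper name:

```python
from typing import Iterable, Iterator, List

MAX_BATCH_MESSAGES = 10
MAX_BATCH_BYTES = 256 * 1024


def batch_messages(bodies: Iterable[str]) -> Iterator[List[str]]:
    """Group bodies into batches that respect the count and payload limits."""
    batch: List[str] = []
    batch_bytes = 0
    for body in bodies:
        size = len(body.encode("utf-8"))
        if size > MAX_BATCH_BYTES:
            raise ValueError("message exceeds 256 KB; store the payload in S3 instead")
        full = len(batch) == MAX_BATCH_MESSAGES or batch_bytes + size > MAX_BATCH_BYTES
        if batch and full:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(body)
        batch_bytes += size
    if batch:
        yield batch


print([len(b) for b in batch_messages(["m"] * 25)])
```

Fewer, fuller requests reduce the request count, which is the same cost lever as long polling.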
AWS SWF - Simple Workflow Service
Manage parallel or sequential services (basically organizes tasks), glorified status tracking system (no drag-and-drop).
what for? BPM, workflows with manual review, external processes, or specialised logic
task based, ensure no dupe allocation (not delivered more than once), tracks tasks, could be humans too (not just AWS services)
actors:
workflow starters - app that initiates a workflow (e.g. eCommerce web site places an order)
deciders - controls flow of activity (order, concurrency, scheduling), when finished (or failed) decides what’s next
activity tasks/worker - program that interacts with SWF to get tasks, process them, return results, and update its status to SWF
have to use polling to see if there are things for them to do
retention period of up to 1 year (SQS is 14 days), pay per 24-hour period
related workflows form a “domain”
signal - interrupt running workflow with new information, could be an external source (e.g. pause until signal received, change order fulfillment process when customer cancels order or updates their address). Could signal other workflows.
cost: per workflow started, number of open/retained workflows in 24-hour period, per combined task/signal/timer/marker used
vs
SQS: SQS is message-oriented, retention is 14 days (SWF 1 year), difficult to coordinate multiple queues
SWF: task-oriented, strong deliver-once semantics (task has unique ID), tracking of tasks/actors, model human-intervention tasks
AWS Step Functions: Step Functions are newer, recommended (assuming it has the features you need), has state machine in JSON instead of decider program (graphical, so simpler?), does retries
SWF: more complex/flexible, program complex logic in your language of choice. Flow Framework has programming constructs to structure asynchronous requests. Supports external signals, and child processes (Step Functions can decompose Step Functions since 2019 too) that return results to a parent.
AWS Step Functions
define apps as a state machine (Amazon States Language, declarative JSON). Can decompose into multiple smaller state machines (create a library, avoid duplication)
good for: out-of-the-box coordination of AWS services, e.g. order processing flow (without human-intervention), atomic coordination of microservices
supports human-intervention, e.g. pause task with task token and email out for approval, email has link backed by API Gateway which passes the token (i.e. callback)
has visual UI to describe flow and real-time status, can search for and drill-down
apps interface and update stream via Step Functions API. Tight integration with Lambda and AWS services
cost: per state transition
limit: max 1 year task/workflow duration
vs
SQS: SQS message-oriented (you build your own application logic/tracking), connects two apps. Step Functions allow coordination across components (status, logging), visualisations
SWF: see “AWS SWF” section
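A minimal sketch of the declarative JSON described in the Step Functions notes above, built as a Python dict. The state names and the Lambda ARN are hypothetical placeholders; the field names (StartAt, States, Type, Resource, Retry, Next) are Amazon States Language fields.

```python
import json

# Hypothetical ARN, for illustration only.
CHARGE_CARD_ARN = "arn:aws:lambda:us-east-1:123456789012:function:charge-card"


def build_order_state_machine():
    """Return a small Amazon States Language definition as a plain dict."""
    return {
        "Comment": "Sketch of an order-processing flow",
        "StartAt": "ChargeCard",
        "States": {
            "ChargeCard": {
                "Type": "Task",
                "Resource": CHARGE_CARD_ARN,
                # Retries are declared in the definition, not coded in a decider.
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 2,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                "Next": "OrderComplete",
            },
            "OrderComplete": {"Type": "Succeed"},
        },
    }


# JSON text that could be supplied as the state machine definition.
definition_json = json.dumps(build_order_state_machine(), indent=2)
```

Compare with SWF, where the same ordering and retry logic would live in a decider program.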
AWS API Gateway
create, publish, monitor (CloudWatch), and secure (authentication/authorisation) APIs at any scale, acts as a “front door” for REST APIs (EC2, Lambda, or web app) - aimed at mobile/web
cost: $/hour for caching (if enabled), number of requests for REST or number of messages transferred for WebSockets
api caching - cache API responses till TTL. Size is 0.5 to 237 GB. Caching at the stage level. Choose keys based on query parameters. Invalidate cache manually from console or client sends a Cache-Control header
maintain API keys, throttle (return HTTP 429), usage plans/quota management, can monetise by publishing to AWS marketplace
vs CloudFront caching - CloudFront: doesn’t cache POST/PUT/DELETE/PATCH, charges per request (instead of by hour)
CORS (cross-origin resource sharing) - allow API gateway to be called from different domain. When enabled API gateway responds to OPTIONS calls with headers Access-Control-Allow-Methods, Access-Control-Allow-Headers, Access-Control-Allow-Origin
custom authorizer aka “Lambda Authorizer” (e.g. Lambda to validate query params or OAuth/SAML/JWT token)
endpoint types:
private - non-internet facing, VPC only
regional - deployed to only one region (endpoint is region-specific), so in-region requests are lower latency (no CloudFront). Can put your own CloudFront in front with WAF. Can use latency based records to have multiple deployments in different regions with the same config (custom domain name)
edge-optimised - default for REST API, deployed to a region and uses hidden (you can’t control) CloudFront (with no WAF) to reduce latency for geographically dispersed clients
monetise with Serverless Developer Portal.
how? setup developer portal (CloudFormation) for customers to sign up, you publish API gateway API to developer portal (assign “usage plan” to stage). Register usage plan in AWS Marketplace by submitting “load form” with an “apigateway” dimension of type “requests” (don’t need to register as a seller???). Update usage plan with product code from Marketplace.
Customer API keys associated with usage plan. Should use more than API key for authorisation (API key is for all APIs within usage plan), i.e. use IAM role, Lambda authorizer or Cognito
limits: region specific, always internet facing (except private endpoints); 29 second integration timeout (hard limit), 10K requests per second (+bursts though), 60 APIs per region
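A rough sketch of the throttling behaviour noted above (steady rate plus burst, HTTP 429 when exceeded), modelled as a token bucket. This is an in-memory illustration, not API Gateway code; the class and function names are made up, and the clock is passed in so the example is deterministic.

```python
class TokenBucket:
    """Token bucket: refills at `rate_per_sec`, holds at most `burst` tokens."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.last = 0.0

    def allow(self, now):
        # Refill according to elapsed time, capped at the burst capacity.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle_request(bucket, now):
    """Return the HTTP status a throttled front door would give."""
    return 200 if bucket.allow(now) else 429
```

With a rate of 1 request/s and a burst of 2, three requests at the same instant give 200, 200, 429, and a fourth request one second later succeeds again.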
AWS SNS - Simple Notification Service
Web service to set up, operate, send push notifications, pub-sub
topic = “access point” allows multiple recipients to get copies of a notification
can deliver to multiple endpoint types (e.g. group iOS, Android, SMS text recipients). Deliver once to topic, goes to all delivery recipients
Lambda, HTTPS, SMS text message, SQS (standard queues only, non FIFO), email (SES), Amazon Device Messaging (push notifications)
published messages stored redundantly across AZs
usages: when auto-scaling goes up/down, push to mobile devices SMS text, email, to SQS, trigger Lambda
for email, recipient has to confirm (similar to verify email address). In console won’t be assigned subscription ARN (will say “PendingConfirmation”) or ID until confirmed
test via the console, TTL for mobile devices (e.g. undelivered because phone is turned off, drop if TTL expires)
cost on:
#publishes (1st 1M free, $0.5 per 1M SNS requests) - includes admin (API calls)
#deliveries (mobile push: $0.50 per million, email: $2.00 per 100,000, HTTP/S: $0.60 per million, SQS/Lambda: free)
GB of data out (first 1GB/mo free, <10TB/mo $.09/GB, and scales down)
patterns:
“fan out” where a message published to an SNS topic is consumed by many parallel applications
SNS to SQS to reliably send messages to consumers
vs SQS: both considered “messaging”
SQS is polling, doesn’t require all consumers to be concurrently available, supports KMS
SNS is push, simple API, easy to integrate, flexible delivery over multiple transport protocols, cheap, no storage (if no subscribers, it’s dropped)
limit: 10M subs per topic; 100K topics per account
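A toy, in-memory sketch of the fan-out pattern and the "no storage" point above: a publish is pushed to every subscriber, and is simply dropped if there are none. The Topic class is hypothetical and stands in for SNS; Python lists stand in for SQS queues.

```python
class Topic:
    """In-memory stand-in for an SNS topic (push delivery, no storage)."""

    def __init__(self, name):
        self.name = name
        self.subscribers = []

    def subscribe(self, deliver):
        # `deliver` is any callable taking the message, e.g. a queue's append.
        self.subscribers.append(deliver)

    def publish(self, message):
        """Push a copy to every subscriber; return the number of deliveries."""
        for deliver in self.subscribers:
            deliver(message)
        return len(self.subscribers)


orders = Topic("orders")
dropped = orders.publish("order-0")  # no subscribers yet, so nothing keeps it

billing_queue, shipping_queue = [], []  # stand-ins for two SQS queues
orders.subscribe(billing_queue.append)
orders.subscribe(shipping_queue.append)
delivered = orders.publish("order-1")
```

Putting a queue behind each subscription is what lets consumers be offline at publish time, which is the SNS to SQS pattern.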
Other things:
Amazon MQ - managed Apache ActiveMQ (industry-standard), so for existing applications MQ provides easiest cloud-migration path
not as flexible as managing on EC2 yourself, doesn’t support VPC endpoint (PrivateLink)
vs SQS: use SQS for new apps (tighter AWS integration, better cloud features)
SES (Simple Email Service) = email marketing, can also receive mail, integrates with Lambda, S3
might be covered
elastic transcoder = transcode media, figures out best transcoding settings based on destination
video transcoding/publishing: upload to s3 -> triggers lambda to transcode using elastic transcoder, create thumbnail ->saves to s3
cost minutes and resolution of transcoding
Business Applications and End User Computing
Just need to know they exist and what they do. Troubleshooting likely not covered.
AWS WorkSpaces = virtual desktop in cloud (run Windows/Amazon Linux in the cloud, DaaS, desktop as a service). Can connect from PC, Mac, Chromebook, iPad, Kindle Fire, Android using special client. Don’t need IAM account, could be federated (i.e. set up by companies IT dept). Remote desktop into it.
Windows 7 or 10. Local admin policy so can install software (can be locked by IT), workspaces are persistent, data on D: is backed up every 12 hours
Why? Highly regulated environment that needs isolation, data leak prevention, need high powered machine when desktops are thin clients, provision desktops for temporary/remote workforce, DR for knowledge worker’s machines, demo products
Workspaces Application Manager (WAM) - deploy (package, upload, deliver) and manage (users/groups, versioning, updating) Workspaces. Two tiers of pricing.
AWS AppStream = stream Windows apps (not whole desktop) from the cloud without code modification, deployed and executed in AWS, output streamed via browser
good for: computational heavy apps to be rendered on lightweight client
on-demand and always-on fleets. Simplifies application management with a base image. Can set max duration.
AWS Connect - contact center with configurable call routing (phone tree), inbound/outbound telephone, interactive voice responses (i.e. voice/text chatbot with Lex), analytics, CRM integration
AWS WorkDocs = enterprise document sharing (OneDrive, Dropbox for enterprise) collaborative/sharing of documents.
AD for SSO
web/mobile clients (but no Linux), SDK available
HIPAA, PCI DSS, ISO compliance
WorkMail = AWS version of MS Exchange (compatible with MS Exchange)
WorkLink - access work intranet site without VPN (which is painful on mobile). It renders intranet site on AWS, then returns vector graphics to client
how? users authenticate federated (e.g. SAML). Web app is rendered on a resource inside VPC, then vector graphics streamed to users mobile device (iOS/Android app to ensure no caching)
Alexa For Business - Alexa at work, personal assistant (Alexa skill to find/book meeting rooms). Helps manage fleet of Echo devices
Need to be able to select right service for the exam. Overview:
ML frameworks and infrastructure for ML researchers/academics who need raw compute power: EC2, GreenGrass (moves AWS cloud closer to edge devices, IoT), AWS Deep Learning AMIs
ML services for data scientists and ML developers: SageMaker (build, train, deploy ML models; replaced AWS Machine Learning) which includes:
managed Jupyter notebook (web app to create/share live code, visualisations)
endpoints to deploy ML models for consumption
AI services for app developers without ML experience:
Comprehend: natural language processing of text, e.g. sentiment analysis of reviews to determine if customers are happy
Comprehend Medical to extract medical diagnoses, medications
Forecast: given time-series data, make predictions, e.g. predict sales/demand
AWS Lex - conversational chatbots (e.g. Alexa, customer service chat bot) with voice and text. Comprised of:
intent - goal user wants to achieve
utterances - spoken or typed words that invoke the intent
slots - data user must provide to fulfill the intent
prompts - questions that ask the user for input
fulfillment - business logic to achieve the intent
vs AWS Chatbot: Chatbot is 2019, purpose built to monitor and interact with AWS resources (ChatOps). Slack/Chime integration.
Personalize - recommendation engine based on demographics/behaviours, e.g. upsell at checkout, Amazon marketplace
Polly - text to speech, generate response to callers
Rekognition - understands content of images/videos, identify objects, identify motions/gestures/facial expressions, e.g. flag inappropriate content
Textract - OCR, e.g. digitise scans of paper forms
Transcribe - speech to text, e.g. create captions, transcripts
Translate - language translator
DeepLens / DeepRacer - learn about machine learning applied to video and autonomous driving respectively. Comes with actual hardware.
Analytics
AWS Kinesis
Streaming, consuming/collecting/storing streamed, handling hundreds of thousands of producers (e.g. social media feed and find positive/negative views, stock prices, game data, geo data for maps, IoT sensors) and store/analyse it.
Kinesis Data Streams - receives data from data producers (hundreds of thousands of sources, small data size (KBs) but at total is TBs/hour - stream to custom apps (e.g. Lambda) or Firehose for storage. Basically Apache Kafka.
capacity = shard: partitions data. All data from a shard processed by the same record processor (make aggregation, filter easier). Default limit of 500 shards but can change to unlimited
each shard: up to 1,000 PUT records/s (up to 1 MB/s input) and 2 MB/s output.
GetRecords from a single shard returns up to 10 MiB or 10,000 records, whichever comes first. Each shard supports 5 reads per second. Still limited to 2 MB/s output (so if one read returns 10 MB, next GetRecords calls within 5 seconds will error).
can use “enhanced fan-out” so each consumer gets its own dedicated up to 2 MB/s output (costs extra per-hr and data processed)
data record stored from 24 hours (default) to 7 days. Contains partition key, sequence number (keeps in order of puts; partition key + sequence number is unique ID of record, but beware retries from producers) and data blob. Max size is 1 MiB data blob.
has high-level libraries for producers/consumers (KPL, KCL) vs AWS API (which has create/delete of streams)
duplicates tricky: sequence number not enough as there could be producer retries with different sequence numbers. Sink needs to handle it based on business ID. Possibly DynamoDB holds list of recently processed items.
cost per shard-hour and per 1M PUT units (25 KB chunk)
gotchas: 1 MiB size prevents images being streamed
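A back-of-envelope sketch of shard sizing and PUT unit counting using the per-shard figures above (1 MB/s and 1,000 records/s in, 2 MB/s out, 25 KB PUT units). The function names are made up, and treating 25 KB as 25 * 1024 bytes is an assumption.

```python
import math

SHARD_WRITE_MB_PER_S = 1.0
SHARD_WRITE_RECORDS_PER_S = 1000
SHARD_READ_MB_PER_S = 2.0
PUT_UNIT_BYTES = 25 * 1024  # assumption: KB taken as 1024 bytes


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(write_mb_per_s / SHARD_WRITE_MB_PER_S),
        math.ceil(records_per_s / SHARD_WRITE_RECORDS_PER_S),
        math.ceil(read_mb_per_s / SHARD_READ_MB_PER_S),
    )


def put_payload_units(record_size_bytes):
    """Number of 25 KB chunks a single record is billed as (at least one)."""
    return max(1, math.ceil(record_size_bytes / PUT_UNIT_BYTES))
```

For example, 0.5 MB/s of small records arriving at 2,500 records/s is bound by the record rate, not the byte rate, and needs 3 shards.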
vs SQS:
Data Streams better for partitioned data: map reduce (processing related records on one processor), shipping application logs in order (i.e. maintain order with sequence ID), multiple consumer apps (Redshift and visualisation together), processing data by one app then again later (e.g. processor app followed by audit app up to 7 days later)
SQS better: process data completely independently (ACK/NACK means no need for check-pointing), FIFO for exactly-once processing, can introduce delay, easier to scale consumers (don’t need to pre-provision shards), has dead-letter/poison-pill management.
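A sketch of why related records land on the same processor: Kinesis hashes the partition key with MD5 to a 128-bit integer and routes the record to the shard whose hash key range contains it. The even split of the hash space and the helper names below are assumptions for illustration.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def shard_ranges(shard_count):
    """Evenly split the 128-bit hash key space into contiguous ranges."""
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last range absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else start + size - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key, ranges):
    """Return the index of the shard whose range contains the key's MD5 hash."""
    hash_value = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    for index, (start, end) in enumerate(ranges):
        if start <= hash_value <= end:
            return index
    raise ValueError("hash value outside all shard ranges")
```

The same partition key always maps to the same shard, which keeps ordering per key but also means a hot key cannot be spread by adding consumers.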
Kinesis Data Firehose - receives data from producers, analyse/process with Lambda (reformat Apache Logs to JSON, compression), then load data into AWS S3, Redshift (via S3 first), Elasticsearch, Splunk, or Kinesis Analytics
How to write to Firehose: integrations with Kinesis Data Streams, CloudWatch Logs/Events, AWS IoT, Kinesis Analytics, Kinesis Agent (Java app which monitors files, handles file rotation, retries) or AWS SDK.
No shards.
Cost based on amount of data ingested (records rounded up to nearest 5 KB). Can also do transformation of JSON to Apache Parquet or ORC (costs extra based on amount of data processed).
24 hour retention (either sent to Lambda or S3 immediately). Ingestion failures can go to S3 (e.g. for ElasticSearch).
limit = 1000KB record (before base64 encoding), 20 streams per region. Each stream: 2K transaction/sec, 5000 records/sec, 5MB/sec.
vs Data Streams + Lambda to write: Firehose has buffer size (MB) and interval (seconds) to make output writes more efficient. Tighter input/output integrations with other AWS services. Lambda for transformation (Data Streams sends to Lambda as a destination). Seems faster (Firehose transactions and records per sec faster than 2 data stream shards), so could be front door for ingestion (store in S3 and send somewhere else too).
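A simplified sketch of the buffer size / buffer interval idea above: records accumulate and are written as one batch when either threshold is hit. The class is hypothetical, and unlike the real service it only checks the interval when a new record arrives.

```python
class BufferedWriter:
    """Batch records and flush when the size or age threshold is reached."""

    def __init__(self, max_bytes, max_seconds, sink):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.sink = sink  # callable that receives one concatenated batch
        self.records = []
        self.size = 0
        self.opened_at = None

    def put(self, record, now):
        if not self.records:
            self.opened_at = now  # the interval starts at the first buffered record
        self.records.append(record)
        self.size += len(record)
        too_big = self.size >= self.max_bytes
        too_old = now - self.opened_at >= self.max_seconds
        if too_big or too_old:
            self.flush()

    def flush(self):
        if self.records:
            self.sink(b"".join(self.records))
            self.records = []
            self.size = 0


batches = []
writer = BufferedWriter(max_bytes=10, max_seconds=60, sink=batches.append)
writer.put(b"12345", now=0)   # buffered
writer.put(b"67890", now=1)   # size threshold reached, one write of 10 bytes
writer.put(b"a", now=2)       # buffered
writer.put(b"b", now=70)      # interval threshold reached, second write
```

Fewer, larger writes are what make the S3 or Redshift output cheaper than one Lambda-driven write per record.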
Kinesis Data Analytics - real time analytics using SQL-like query or Java code. Store results in S3/RedShift (not mentioned in documentation)/ElasticSearch (via Firehose). Similar to Apache Spark
cost per Kinesis Processing Unit-hour (1 vCPU with 4GB RAM). Java processing is 1 extra KPU per application
integrations with Kinesis Data Streams and Firehose on ingestion and Firehose for emitting processed records
Scripting: SQL, Java. Lambda for pre-processing and destination
seems more like a transformer with Apache Flink (stateful computations on streams) vs visualisation (Kinesis Data Analytics has none, have to write out)
cost: KPU (1 CPU and 4GB RAM) per hour, plus online storage GB
vs Data Streams + KCL: Data Analytics uses KCL (Kinesis Client Library) under the hood but less flexible, although handles checkpointing (see Apache Kafka) for you
What: Data Streams = topic/stream; Firehose = ETL; Analytics = real-time analytics (code runs in Kinesis)
Cost: Data Streams = per shard-hour, PUT units; Firehose = per GB ingested, per GB transformed, per GB and hour processed in a VPC (optional); Analytics = per KPU/hour (compute unit), per GB-mo storage
Getting data in: Data Streams = KPL (Java with C++ module, actually a separate process), AWS SDK (low-level, KPL recommended over this), Kinesis Agent (Java app monitors files and sends them)
but: no scaling of brokers as of Sep 2019 (on roadmap; can scale storage)
pay per broker-hr (basically EC2 hours) and storage (GB-mo)
QuickSight = BI service (Cognos competitor)
AWS Glue - create data catalog/table (i.e. storage-independent schema) and crawler to populate the table (from S3, JDBC, DynamoDB) which run on-demand or schedule. Also define ETL jobs.
pricing across: Data Processing Unit per hour (DPU, a 4 vCPU 16 GB RAM of processing) for ETL and crawler runs, data catalog (objects stored and requests).
ETL based on Apache Spark supporting Scala (presumably Java too), Python
AWS Lake Formation - Extends AWS Glue. Catalog data from source (S3, RDBMS, NoSQL), cleanse (Glue ETL, apply Machine Learning scripts) and store in S3, to be consumed by Redshift, Athena, EMR. Lake Formation magic: set access controls.
AWS EMR Elastic MapReduce = collection of products for big data processing (Elastic MapReduce helps wrap/coordinate these products), uses Hadoop (can add Spark, HBase, Flink, Presto), auto-scales across EC2 instances.
use for: log analysis, financial analysis, ETL
concepts:
cluster step - a Hadoop MapReduce application similar to one algorithm that manipulates data (e.g. counting words in text and then sorting are two steps). Could be Hive SQL script, custom JAR, or Pig script
cluster - collection of EC2 instances to run steps.
types of nodes: Core (run tasks, have persistent data in HDFS). Task (optional, only runs tasks, on-demand/spot-pricing ephemeral storage). Master coordinates and not fault-tolerant (if dies, whole cluster dies). Scale out core/task nodes for storage/compute
data input/output is read/stored on S3
to process more data, Hadoop needs more nodes and has a replication factor ~3x plus maintain 40% free for intermediate data (e.g to store 5TB of data on m1.xlarge which has 1.67TB, need 5 * 3 / 1.67 / (1 - 0.4) = 15 nodes).
can attach EBS volume (for more disk) but data is not persisted. Storage types:
HDFS is considered ephemeral (useful in cloud where clusters are disposable) and just temp space for intermediary results
EMR File System (EMRFS) - extends HDFS to allow access to S3 objects
input split size defines how many logical splits of the input file there will be. Controls how many “map” instances there can be (i.e. amount of parallelism)
cost - additional per-hour price on top of EC2 pricing (~+25%)
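The HDFS node-count arithmetic in the EMR notes above (replication factor ~3x, 40% kept free) can be written as a small helper. The function name and default parameters are illustrative; the formula is the one from the worked example.

```python
import math


def hdfs_nodes_needed(data_tb, node_storage_tb, replication=3, free_fraction=0.4):
    """Nodes required to hold `data_tb` with replication and free-space headroom."""
    raw_nodes = data_tb * replication / node_storage_tb / (1 - free_fraction)
    return math.ceil(raw_nodes)


# 5 TB on nodes with 1.67 TB each: 5 * 3 / 1.67 / (1 - 0.4) is about 14.97
nodes = hdfs_nodes_needed(5, 1.67)
```

Rounding up matters here: 14.97 nodes means 15 instances, and reading input from S3 via EMRFS avoids most of this sizing.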
data pipeline = moves data from one service to another. Schedule EC2 instances to start and process work, e.g. log processing
vs Simple Workflow Service: Data Pipeline specific for data-driven flows, i.e. trigger when input meets criteria, copying/moving data between different compute resources or data stores, chained transformations
AWS ElasticSearch Service (ES) = managed Elasticsearch with Kibana, e.g. log analytics. Elasticsearch, Logstash to load data, Kibana to visualise (ELK). Could replace Logstash with CloudWatch, Firehose, IoT
Why? search engine (as index for free text documents), analytics, visualisation
Cost: on-demand special EC2 instances, EBS volumes
Vs QuickSight: ES standard. QuickSight has tighter AWS integration.
CloudSearch = create/scale search services on a web site, autocomplete, highlighting, geo-spatial
not covered
Analytics And Storage Asides
data warehouse vs big data vs no-sql
http://www.b-eye-network.com/view/17017
big data is
technology capable of holding very large amounts of data.
technology that can hold the data in inexpensive storage devices.
technology where processing is done by the “Roman census” method.
technology where the data is stored in an unstructured format.
data warehouse is
Inmon (top-down) vs Kimball (bottom-up)
assuming Inmon:
subject-oriented, nonvolatile, integrated, time variant collection of data created for the purpose of management’s decision making.
provides a “single version of the truth” for decision making in the corporation
provides scale, redundancy (at the cost of consistency but this can be tuned)
unstructured data fits better with object-oriented programming
schema is flexible
referential integrity?
relational still good for
tabular data that is complete and doesn’t change often, e.g. timetable, spreadsheet
data warehouse, OLAP reporting
big data implementations
Hadoop - distributed file system and processing built from commodity hardware, mapreduce, term can mean ecosystem (additional packages installed, e.g. Spark), master/worker roles
Hadoop HDFS (Hadoop Distributed File System) stores files across nodes with replication, designed for large static files, may struggle with large number of small files, tailored for MapReduce/analytics
Hadoop MapReduce - distributed processing
Oozie - workflow scheduler for Hadoop
Pig - high-level scripting of data analysis compiled to mapreduce programs
Hive - data warehouse/SQL over Hadoop/HBase. Execution via Spark or MapReduce
Mahout - machine learning/linear algebra/solvers for Hadoop
Ambari - management layer to provision/manage/monitor Hadoop cluster
also enterprise support for this ecosystem (contribute to open-source). E.g. Hortonworks, Cloudera.
Zookeeper - resource coordination
HBase - columnar data store on top of Hadoop HDFS or Alluxio
Sqoop - “SQL to Hadoop”, CLI to facilitate data import from other RDBMS into Hadoop
MapReduce - programming model to parallel/distribute large data set for scalability and fault-tolerance.
Algorithm:
map = worker node does filtering/sorting (e.g. sort students by first name into queue, one name for each queue). Queue is temporary storage. Master node ensures only one copy of redundant data is processed
shuffle = worker node redistributes data based on queue (so all work for one queue/key is together)
reduce = workers process data in parallel, summary operation (e.g. count students in a queue, to get frequency of name)
need to optimise communications to be efficient (e.g. compute close to the data)
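A single-process sketch of the map / shuffle / reduce steps above, using the first-name frequency example from the notes. Real MapReduce distributes each phase across worker nodes; the function names and sample data here are made up.

```python
from collections import defaultdict


def map_phase(students):
    """Map: emit a (first name, 1) pair for every student."""
    return [(full_name.split()[0], 1) for full_name in students]


def shuffle_phase(pairs):
    """Shuffle: group all values for the same key together (one 'queue' per name)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups):
    """Reduce: summarise each group, here by counting its entries."""
    return {key: sum(values) for key, values in groups.items()}


students = ["Ana Silva", "Ben Wong", "Ana Jones", "Cara Lee", "Ben Smith", "Ana Ito"]
counts = reduce_phase(shuffle_phase(map_phase(students)))
```

Each group in the reduce step is independent, which is what allows the workers to run in parallel.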
Apache Kafka - distributed streaming platform
general functions: pub-sub (but better, see below), store streams in fault-tolerant way, process streams in real-time
features: multi-tenant, guarantee order within a partition, fault tolerant via “checkpointing” (need to periodically persist, to say Zookeeper (coordinate distributed DB), where in stream we’ve processed in case of crash, complications of )
pub-sub: traditionally queues only have one consumer (so don’t scale) or broadcast to everyone (inefficient) without guaranteeing consumers process in order
good at set-oriented operations (searching, counting)
not traditional row-oriented storage with an index of every column (that implies index is from id->data), but instead data is the key. Allows good compression (not sparse data, columns similar). Writing more complicated
implementations: AWS Redshift
no-SQL vs RDBMS design principles
RDBMS typically flexible to query (data normalisation and can be changed later (indexes, views, materialized views) once access patterns are known) but doesn’t scale
No-SQL typically very efficient to query in a limited number of access patterns and expensive outside this. Therefore access patterns (amount of data stored, throughput), what attributes required is critical so that partitions set correctly
keep related data together, fewer tables better
normalisation (need to join on read, or split apart for writes) and ACID (slows throughput, requires synchronisation) barriers to scalable RDBMS.
AWS AppSync - sync data between mobile devices and AWS (DynamoDB, Elasticsearch, or Lambda) using GraphQL and handling offline devices
Migration Services and Cloud Adoption Framework
Migration strategies:
Re-host (“lift-and-shift”) - move resources with no changes (e.g. move MySQL on-prem to EC2 instances)
easy but minimal benefit/flexibility
Re-platform (“lift-and-reshape”, “lift, tinker, and shift”) - move assets but change underlying platform taking advantage of cloud services (e.g. move MySQL on-premise to RDS)
moderate difficulty, moderate benefit
Re-purchase (“drop and shop”) - abandon existing and purchase SaaS (e.g. abandon legacy CRM and migrate to salesforce SaaS)
easy but limited flexibility. Whitepaper considers more expensive than rehost with similar benefit
Re-architect - dramatic redesign with cloud-first/native approach (e.g. go serverless)
hardest but most benefit/flexibility/performance/scalability
Retire - get rid without replacement
Retain (“revisit”) - do nothing now. Perhaps requires on-prem refactoring first before migration. Revisit later.
TOGAF - The Open Group Architectural Framework - approach for designing, planning, implementing, governing enterprise IT
Fortune-500 favourite, de facto enterprise architecture standard
Critiques - enterprise architecture is not TOGAF (need more, but often fills a vacuum) so victim of unreasonable expectations
not a cookbook, taken too literally, some practitioners suck
What is a framework? Is: information to help organise thoughts, open for localisation/interpretation, should be adopted into organisational culture. Not a literal recipe for success.
Phases of Cloud Adoption:
project phase - run projects to get experience of cloud benefits
foundation phase - expand benefits beyond projects. Create: cloud center of excellence (CCOE), operations model, security/compliance readiness, landing zone (pre-configured, secure, multi-account AWS env)
migration phase - migrate existing apps (including mission-critical) or entire data-centers to cloud, IT portfolio increases
reinvention - once in cloud have speed/flexibility to reinvent (e.g. retire technical debt) to transform business for speed/flexibility
Cloud Adoption Framework
not just technical:
business - business goals, business case, measure TCO, ROI. Why migrate to cloud? How does cloud affect existing IT strategy?
people - culture, re-evaluate roles in cloud-future world, training, aligning goals
governance - portfolio management tailored for determining cloud eligibility/priority, program management, agile projects, align KPIs with business
platform - standardising resource provisioning, publish/reuse architecture patterns and align to cloud-native, new skills to leverage cloud
security - new IAM modes, logging/audit capabilities evolve, shared responsibility model removes some things
operations - monitoring needs to be automated, performance management to scale elastically, business continuity and DR has new cloud features
doesn’t include:
communication with external customers - migration should be transparent
Evaluating cloud migration needs: i) accurate inventory of on-prem, ii) verifying current on-prem ownership costs, iii) re-architect tightly coupled interfaces to be loosely coupled
not: ensuring change management process in-place (just evaluating phase), tie individual performance objectives to cost savings goals
Hybrid Architectures - combines cloud and on-premise resources
common pilot for cloud migrations
cloud-infrastructure can augment or extend on-premise. E.g.
VMWare can deploy VMs to on-prem or cloud (VMWare has plugins to support AWS migration, if on-premise server overprovisioned)
stored gateway volumes use AWS for storage but no-changes on on-prem clients
ERP with middleware sending events to SQS to do analytics into DynamoDB
loosely coupled architectures allow services to exist anywhere
Migration Tools
Storage: AWS storage gateway and AWS Snowball
AWS Server Migration Service: automate migration of on-prem VMWare vSphere or Microsoft Hyper-V/SCVMM VMs to AWS
VM volumes replicated to AWS and kept in sync, creates AMI periodically
can provide DR, can minimise migration downtime since they’re incrementally sync’d
how: Server Migration Connector is a virtual appliance for vSphere/Hyper-V
limits: Windows/Linux only (since AWS only supports these OS)
AWS DataSync - helps on-premise (NFS share, SMB) to AWS (EFS, S3 (metadata compatible with File Gateway), Glacier, WorkDocs) data migration. Handles transfer acceleration, encryption, verification, error notifications. Uses a special VM.
pricing: additional per GB transferred on top of S3, EFS.
AWS Database Migration Service and Schema Conversion Tool
convert between vendors/data types (with Schema Conversion Tool, SCT), parallel transfer, replicates source database changes during migration to the target (RDS or EC2 instances)
SCT typically for larger, more complex migrations, e.g. data warehouse - doesn’t support ongoing replication
for migration assessment, schema conversion, and application code conversion steps
can be used independently with(out) DMS
AWS DMS typically for smaller migrations (<10TB). About data replication between source/target
use cases: modernise (switch to open source/Aurora), migrate (into AWS, upgrade DB version, EC2-Classic to VPC, consolidate shards into Aurora), replicate (create cross-region read replica, keep dev/test in sync)
supports on-prem/on-EC2, everything on RDS, MongoDB (source for DocumentDB), SAP ASE, IBM DB2 (source), S3 (source (CSV with JSON metadata so know how to read) and target (CSV/Parquet), likely used as intermediary for large data or warehouses)
Snowball: for initial load and cache transactional changes until RDS databases created and then apply them
uses replication instance with a replication task to replicate data from source to target via proprietary Change Data Capture (CDC) process (platform agnostic but does use source’s built-in functionality so do need to enable some features)
replication strategies:
full load - migrate existing data creating tables if required, needs outage (no writes allowed) to do initial copy. E.g. for loading PROD into a PRE-PROD testing env
full load + CDC - migrate existing data and ongoing changes
CDC only - changes only when native tools for copying better. Initial copy is a select * (which allows flexibility to filter out rows/tables, but slow)
can use SCT’s schema mapping rules (JSON file)
gotchas:
migrating full LOBs slow (can’t pre-allocate memory so migrated one-at-a-time in chunks). Can also truncate (limited LOB), or not migrate LOBs
doesn’t create foreign keys, secondary/unique indexes, views, stored procedures/functions/packages, etc. Use SCT to generate DDL script and apply it
whitepaper:
common steps:
migration assessment - includes network/application diagram, which bits auto migrated and estimates for manual. Specifically a Database Migration Assessment Report:
maps source to target schema objects and action items for them (from fully automated, manual select data types/attributes, manual rewrite)
recommended target engine based on source and features
AWS recommendations (substitutes for missing features, advice on saving on licensing costs)
Sharding for very large databases
schema conversion
convert the schema (DDL) which produces a script. Version control it. Probably want to do iteratively, and apply secondary indexes later (to make insert faster).
can apply mapping rules (rename, add/remove prefix/suffix), change column’s data type, rename tables.
then apply DDL to target
application code conversion - for SQL in reports, ETL, application code. Steps:
run assessment to determine effort
analyse code to extract SQL statements. SCT attempts to do this
use SCT to convert as much as possible (can do prepared statements)
manually fix remaining action items
save code changes
migrate data - using DMS. Apply indexes after
testing - ideally run automated tests against both source and target. Inspect data directly.
data replication using DMS
go live - have rollback plan, monitor, use multi-AZ RDS, re-enable backups on target
AWS Application Discovery Service
gather information about on-premise data-centers to help plan AWS migration
collects: config, usage, behaviour data. Can estimate TCO of running on AWS
how: for VMWare (agentless), for non-VMWare (agent)
limit: OS support is Windows/Linux
AWS Migration Hub - project-console across all services above for AWS-migration of servers (discovery and migrate). Integrate with DMS, SMS, and 3rd party tools for SaaS migration.
AWS Managed Services - follows ITIL, controlled change management (need to lodge change requests, changes during window, no one has persistent access). Prebaked templates (presumably CloudFormation) for infrastructure setup. Has landing page. Not true “managed service” (more for infrastructure, you need to manage config).
pay per percent of AWS usage.
Network Migrations and Cutovers
CIDR reservations - can’t overlap. VPC is /16 (65,536 addresses) to /28 (16 addresses) - but 5 IPs are reserved in each subnet
VPN connection is typically first foray into AWS, then progress to Direct Connect with VPN as backup
easy to move to Direct Connect with BGP routing - configure both within same BGP prefix and weight Direct Connect higher
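A small sketch of the CIDR checks from the notes above, using Python's standard ipaddress module: address counts for /16 and /28, the 5 reserved addresses per subnet, and an overlap test before connecting networks. The helper names and example ranges are illustrative.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # addresses AWS keeps back in every subnet


def usable_hosts(subnet_cidr):
    """Addresses left for your own resources in a subnet."""
    return ipaddress.ip_network(subnet_cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def overlaps(cidr_a, cidr_b):
    """True if the two ranges share any address (so they cannot be routed together)."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


largest = ipaddress.ip_network("10.0.0.0/16").num_addresses
smallest = ipaddress.ip_network("10.0.0.0/28").num_addresses
```

Running the overlap check against the on-premise ranges before picking a VPC CIDR avoids re-addressing later, since VPN and Direct Connect both need non-overlapping ranges.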
AWS Snow Family - evolution of AWS Import/Export - move massive amounts of data
accelerate transfer using storage device, bypass internet, using AWS internal network, into EBS/S3/Glacier, export from S3
AWS Import/Export Disk - you send storage device, and they send back to you, when to use it (depends on quantity of data and internet speed, table here)
cost = per device fee, time to transfer ($/hour), return shipping
AWS Snowball - only some regions, petabyte-scale transfer, big briefcase AWS sends you, 50 or 80TB snowballs, s3 only, simpler than shipping them a disk (no hardware purchased, security done), always recommended over disk (easier for AWS, not on frontpage anymore)
why? where network slow/expensive, share with customer/associates
bad: if transfer <1 week over internet (just use internet), ongoing data transfer (use Direct Connect)
100 Mbps ≈ 7 TB per week (or 1 TB per day; or 45 GB per hour)
interface is command line tool (special “cp”) or subset of AWS CLI S3 command, NAS
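The "internet vs Snowball" rule of thumb above is simple arithmetic. A rough sketch, assuming decimal units (1 TB = 10^12 bytes), a fully utilised link unless stated otherwise, and the one-week threshold from the note; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400

def transfer_days(data_tb: float, link_mbps: float, utilisation: float = 1.0) -> float:
    """Days needed to push data_tb terabytes over a link of link_mbps megabits/second."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilisation)
    return seconds / SECONDS_PER_DAY

def prefer_snowball(data_tb: float, link_mbps: float, threshold_days: float = 7.0) -> bool:
    """Rule of thumb from the note: use the internet if it takes under about a week."""
    return transfer_days(data_tb, link_mbps) > threshold_days

if __name__ == "__main__":
    for tb in (1, 10, 80):
        days = transfer_days(tb, 100)
        print(f"{tb} TB over 100 Mbps: {days:.1f} days, snowball={prefer_snowball(tb, 100)}")
```

Real links rarely sustain full utilisation, so passing a lower utilisation (e.g. 0.8) gives a more conservative estimate.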
AWS Snowball at Edge - snowball appliance + EC2, S3, Lambda, some clustering (i.e. own little cloud, VM-host)
why? remote locations with bad internet (e.g. military), do local computation then return data to AWS
AWS Snowmobile - literal shipping container up to 100 PB of storage.
Whitepaper and re:Invent
Migration whitepaper:
Cultural change essential, requiring an Organisational Change Management (OCM) framework:
mobilise team & align leaders: to confirm sponsorship, secure resources/experts, coalition of leaders, create momentum
by: form a team to lead change, establish roles/milestones, shape program governance, align leadership roles
envision the future & engage the org: to articulate vision/roadmap for cloud transition, mobilise org/commitment, establish comms to maintain buy-in/support/participation
by: leaders communicate, business leaders reinforce (via process/tech/org), identify/assess impacted stakeholders, enlist/mobilise change champions, answer “how this impacts me?”, celebrate progress, control issues
enable capacity & make it stick: to ensure successful transition, align IT org/roles/process to cloud, ensure everyone can operate in new cloud world, ensure cloud benefits realised
by: identify change impacts and modify to IT org/roles/process, targeted training, setup measurement/evaluation of outcomes, correct when needed
Business value/reasons - #1 reason is agility (idea to implementation in minutes). Also: cost (elastic base), productivity (concentrate on business with managed services), cost avoidance (maintain hardware, hardware refresh), operational resilience (security, failures)
Migration strategy: picking the 6 “R”s (see above):
consider: time constraints (data center contracts), who will operate AWS (outsourced provider? in-house? desired long-term picture?), critical standards (e.g. minimum business continuity), automation requirements (to realise flexibility/speed, on which apps, how enforced?)
Business case:
answers: cost comparison? cost to migrate? ROI and payback period? non-cost benefits? how business agility improved?
contains: i) cost analysis, ii) cost of change, iii) labour productivity, iv) business value
cost analysis - TCO comparison, assessment of purchasing/pricing options (on-demand, spot, reserved, volume discounts), discounting (Enterprise Discount Program (EDP), service credits, Migration Acceleration Program incentives)
labour productivity - reduction in legacy activities (requisition hardware, installation, patching), automation benefits, developer productivity
business value (see above)
right-sizing (map workload to resources to avoid over-provisioning; turn off unused resources outside business hours)
tools: AWS Simple Monthly Calculator, AWS Total Cost of Ownership calculator (compare on-prem vs AWS cost specifically for business cases; based on #VMs and storage)
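The "turn off unused resources outside business hours" part of right-sizing can be estimated up front. A minimal sketch, assuming per-hour on-demand billing for compute only (attached storage keeps costing money while an instance is stopped); the schedule values are illustrative.

```python
HOURS_PER_WEEK = 24 * 7

def scheduled_saving(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of compute cost saved by running only on the given schedule."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK

if __name__ == "__main__":
    # Hypothetical dev environment: 10 hours a day, weekdays only.
    print(f"saving: {scheduled_saving(10, 5):.0%}")
```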
re:Invent:
types of business cases:
high-level - high-level data, establishing direction for cloud journey, estimate whether something is worth it
refined - more detail around people, confirms high-level business case. 80% accurate.
detailed - deep-dive, comprehensive impact, 95% accurate
technology optimisation - don’t want lift and shift (like for like) as not getting benefit. Compare cost of on-prem to future state re-architected AWS env.
Pillars of technology optimisation: i) leverage managed services (RDS) and high-level services (Lambda), ii) fit resource to workload (choice of instance type/families, storage types/classes), iii) price (purchase options, BYOL licensing or switch to MySQL), iv) scale (scheduled or react to event/alarm), v) iterate (periodically review, update, experiment)
hidden costs: training (ideally everyone to gain momentum, form CCoE); shifting governance (e.g. fit ITIL onto cloud), how will assets be managed (possibly existing config management will need update), who/how will provision (dedicated team vs self-service)
platform - likely needs modernising (can’t just rehost), but also things like issue tracker, monitoring, operating model will change
cost of inaction - costs that are going to be done regardless of on-prem or public cloud (application modernisation is a part of operating costs, refreshing hardware every 5 years, software licensing increasing (gets cheaper on AWS))
business value
quantifying tangible benefits: lower time spent supporting infrastructure -> more time to focus on business initiatives -> innovation. Examples:
cost avoidance - hardware refresh, forecasting uncertainty matching supply-demand (look at past utilisation), sunk costs (services/products purchased but unused because fixed contract)
resistance to turning off servers - give them IAM console access to start them, pick their own schedule
pass cost responsibility onto owners (need dashboards and reports to help this)
Migration Readiness and Planning (MRP) - tools/process/best practice to prep for cloud migration. Aligns with CAF.
assessment - consider: scope/business case well defined? evaluated env/apps via CAF lens? operations/employee skills reviewed/updated? have experience to do migration?
application discovery - ideally automated tooling to identify existing on-premise resources. Allows costing for business case
application portfolio analysis - after application discovery, group apps by pattern. Order 6 R’s migration strategy for each pattern. Data-driven. Examples:
most servers are Windows OS, some may require OS upgrade. Databases are 80% Oracle, 20% MySQL. Division by business unit or by environment (dev, prod). Score/priority based on: cost-savings potential, business importance, utilisation, migration complexity, 6 R’s
migration planning - review/define project management methodology/tools, define project charter/communication plan, project plan, risk log, responsibilities matrix, identify key resources/people for each migration work stream
technical planning - takes application portfolio and creates priority backlog of apps. Iterate to break down whole migration/learn
virtual private cloud environment
security - build it in. CAF security themes:
core themes: IAM, logging/monitoring, infrastructure security, data protection, incident response
augmenting themes (drive continuous excellence): resilience, compliance validation, secure CI/CD, configuration and vulnerability analysis, security big data analysis
operations - includes: service monitoring, performance monitoring, inventory management, release/change management, reporting/analytics, business continuity, IT service catalog
consider: need CCoE to centralise best practice, different apps need different operating models, managed services ease operations burden
platform - principles/patterns for implementing solutions and migration
includes: compute/network/storage/DB provisioning, systems/solution architecture, application development
key elements:
AWS landing zone: initial structure/config of AWS account/network/IAM/billing
account structure: setup initial multi-account structure for org
network structure: baseline network for most common patterns, baseline AWS-on-premise connectivity, user configurable options
predefined identity/billing frameworks: cross-account identity and access management (e.g. AD), central cost management reporting
predefined user-selectable packages: which integrate AWS-related logs into reporting tools, integrate with AWS Service Catalog, automation infrastructure.
consider: using managed service provider if you lack experience, identify account/billing structure upfront, will likely be hybrid cloud solution
Migrating
the first migration - recommend 3-5 apps, pick a common workload pattern within portfolio (e.g. re-hosting application via server replication tools, replatform database). Pick before MRP so can shape MRP for it
migration execution - after first migration, scale out to multiple parallel teams
Application Migration Process: six-step process
discover: extend portfolio discovery with Discover Business Information (DBI, owner, roadmap, runbooks) and Discover Technical Information (DTI, server stats, data flow, connectivity). Create plan and confirm with owner
design: develop/document target state (AWS architecture, app architecture, data flow, support/operational processes)
build: execute the design, some validation
integrate: connect to external service providers/consumers. Demo functionality
Portfolio Discovery & Planning: accelerate discovery, optimise application backlog. Work to eliminate objections/wasted effort
Migration Factory Teams: scale-out teams in parallel. 25-50% of portfolio use repeated patterns (unusual/complex portfolio items candidates for re-architect). Based on pattern:
re-host team: high-volume/low-complexity apps leveraging automation. Integrated into patch-and-release management
re-platform team: design and migration activity, repeatable change in app architecture
re-architect team: complex apps, possibly coordinate via app owner and added to roadmap
Migration readiness re:Invent 2017:
need:
executive sponsorship - formulate/evangelize strategy, comms plan, new roles/learning, prioritise migration (add or realign resources), define/measure success (KPIs, time between sprints, costs), build collaboration across org (conflict resolution)
business case - TCO. Refine it (which VMs are in scope, which will be retained/removed/re-platformed)
people - training (labs, certs), workshops with SA/Proserve. CCoE. Skills assessment. New job descriptions.
foundational experience - so can have intelligent/informed discussions on pro-cons
visibility - know state of IT portfolio, dependencies, licenses
operating model - understanding what models exist now and why. Know what capabilities need to achieve vision.
Example models: traditional (dev team, infrastructure team + managed hosting, many teams); automated efficiency (devops, separate cloud engineering); devops (you build it, you own it)
ideally don’t have more operating models after cloud than before (i.e. want to simplify/standardise)
migration readiness assessment tool - detail areas based on CAF where weaknesses are before doing migration.
migration readiness and planning - approach migration in systematic way (sprints to improve readiness - just like any other task), AWS tool to weight backlog
on-demand (JIT), pay-as-you-go (utility billing), no upfront
advantages:
trade capital for variable expenses
economies of scale
stop guessing capacity
more speed/agility
no cost to maintain data center
go global in mins (AWS has locations everywhere)
security: state-of-the-art surveillance, MFA, 24x7 guards, access on “least-privilege basis”
compliance: SOC 1/2/3, PCI DSS Level 1 (Payment Card Industry Data Security Standard, allow you to take credit cards, business processes), ISO 27001/9000, HIPAA (Health Insurance Portability and Accountability Act), MPAA (Motion Picture Association of America)
AWS secures the cloud and managed services (RDS, DynamoDB, Redshift), customer secures stuff (VPC, EC2, S3, etc) on top of cloud
customer responsible for credentials of managed services
recommends MFA, SSL/TLS, API/user activity logged in CloudTrail
storage decommissioning - DoD (NIST 800-88) process to sanitise disks (physically destroyed)
network Security - connect to AWS over HTTP/S, for additional security, VPC + IPSec VPN to connect AWS to your datacenter
AWS separate from amazon.com (shopping)
prevention - DDoS, MITM, IP spoofing (AWS controlled, host-based firewall to stop sending traffic with source IP or MAC other than own), port scanning, packet sniffing of other tenants
pen testing - must request vuln scan in advance, only your own instances
credentials - passwords (6-128 chars), MFA (6 digit code), access keys (signed request to AWS API, e.g. SDK, CLI), key pair (1024bit, SSH-2 RSA key, used for SSH EC2, CloudFront signed URL), X509 certs (used by: sign SOAP request to AWS API, SSL cert)
AWS trusted advisor - inspects env and recommends save money, improve performance, security holes
VMs - Xen (not Hyper-V, VMware) hypervisor (between hardware and VM) isolates instances running on same physical hardware, firewall within hypervisor separates ports/RAM
VMs no access to raw disk (it’s virtualized with proprietary tech), resets blocks between customers, RAM scrubbed before being allocated back to pool
Only customer has root access to guest instance, AWS has zero access
Firewall deny-all by default, EC2 security groups need to add ports on inbound
Encryption of EBS with AES-256, occurs on VM host between EC2 and EBS storage. To be efficient with low latency, only available on more powerful instance types (M3, C3, R3, G2)
ELB - can SSL terminate on LB, identifies client IP whether HTTPS or TCP LB
AWS Risk & Compliance
AWS has strategic business plan including risk identification, to control/mitigate/manage risk, reviewed every 6 months
AWS scan their own endpoints for vulns (don’t scan customer instances), external vuln assessment by independent 3rd parties - only done for AWS (customer has to do their own)
auto-scaling, proactive scaling - traditional scale-up and scale-out laggy/expensive/difficult
design for parallelism to decrease run time by spinning up more machines to get results faster, then scale down on completion
testability (mimic PROD, spin up more environments)
DR, design for failure - assume outages, hardware failures, more requests/demand than expected. Fix by:
automation, use pre-built AMIs, start services on (re)boot, sync from queues (no state on machine), avoid in-memory state use data store (S3, RDS)
in detail: use multiple AZs, CloudWatch for monitoring, cron snapshots of EBS
decouple components - loose coupling, i.e. SQS, separate web vs app server, batching using async
implement elasticity - i) proactive cyclic scaling (periodic scaling at fixed interval, daily, weekly, monthly, e.g. tax time, pay day), ii) proactive event-based scaling (scale based on an event, e.g. sale), iii) auto-scaling (based on metrics, e.g. CloudWatch). Won’t get same hardware spec in cloud but compensate with scale (slave instances, replicas, sharding)
parallelism - ELB, multi-threaded apps/S3
secure - web-tier on 80/443, SSH to app only, tiers can talk to each other
Pricing
Primarily based on: compute, storage, outbound data transfer
No charge for data in and transfers within the same region
Well-Architected Framework whitepaper
general design principles:
stop guessing capacity requirements - you can scale
test systems at production scale - easy to scale
automate to allow for architectural experimentation - track changes, then rollback automatically
allow for evolutionary architectures
data-driven architectures - fact-based decisions from CloudWatch
improve through game days - e.g. simulate Black Friday sale
definitions
workload - set of components that deliver business value together
architecture - how components work together to achieve a workload
technology portfolio - collection of workloads required for a business to operate
5 pillars of:
security
apply security at all layers
enable traceability - logging to determine how attacks occurred
automate responses to security events - e.g. if someone is brute-forcing port 22 ssh logins then whitelist allowed connections
focus on securing your system - shared security model (what you are responsible for securing)
automate security best practices - e.g. use guides to harden OS and create AMI out of it for auto-scaling
definition of 4 areas:
data protection - classify data (e.g. public, organisation only, certain members within org), use least privilege access, encrypt at rest (S3, EBS, RDS) and transit (ELB)
best practices - encrypt your data, rotate keys, enable detailed logging on important files, use CloudTrail to audit changes to the cloud, data resilience S3 has 11 nines of durability, versioning on S3, AWS never moves data between regions unless customer enabled a feature
privilege management - authentication and authorisation via IAM, i.e. ACLs, role based access controls, password policy (rotation)
best practices - 2FA for root access, manage perms via IAM groups with different powers, limit automated access via roles
infrastructure protection - physical which AWS does (protect data centre, RFID, locked cabinets, CCTV).
best practices - you need VPC (public/private subnet with bastion or NAT, NACLs (subnet level, can deny), security groups (host-level, allow-only), user logins for EC2), antivirus
detective controls
best practices - CloudTrail (capture/analyze logs, per region setting), CloudWatch, AWS config, S3, glacier; Lambda to find access keys not used recently
reliability - recover from disruption, allocate additional resources to meet demand
design principles
test recovery procedures (e.g. Netflix simian army) which is hard to do on-prem
automatically recover from failure (i.e. terminate bad EC2 instance and launch new one)
scale horizontally - favour many small resources over fewer large resources
foundations - e.g. physical data link between HQ and datacenter is difficult to upgrade (do it right the first time). AWS is effectively limitless but service limits do exist (e.g. 5 VPCs per region per account)
questions - how are service limits managed? approval/change process, monitoring usage. How are you planning network topology? Do you have escalation path for issues? Vendor/Account-manager/contact, worth upgrading service level
services - IAM, VPC
change management (anticipate and monitor changes), e.g. auto-scaling and CloudWatch
services - CloudTrail
failure management - assume failures occur, determine how they happen so they can be responded to/prevented
best practices - how to backup? S3. How to withstand component failures? how to plan recovery?
services - CloudFormation, rds multi-AZ, auto-scaling groups
reliability pillar whitepaper:
limits: physical (max bandwidth of cable, latency, storage size), service (request limits; AWS is per region + account).
mitigate: need alerting/monitoring to know when about to hit these limits (Trusted Advisor, CloudWatch alarm); for service rate limits need SDKs with throttles or exponential backoff/retry
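Exponential backoff/retry for rate limits can be sketched as below. The AWS SDKs already retry throttled calls internally, so this only illustrates the idea; ThrottledError, the delay values and the jitter choice (random fraction of the ceiling) are assumptions, not an AWS API.

```python
import random
import time

class ThrottledError(Exception):
    """Hypothetical stand-in for a service 'rate exceeded' response."""

def backoff_delays(retries: int, base: float = 0.1, cap: float = 5.0) -> list:
    """Delay ceilings that double each attempt, limited to cap seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]

def call_with_backoff(fn, retries=5, base=0.1, cap=5.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn, retrying on ThrottledError with jittered exponential backoff."""
    for ceiling in backoff_delays(retries, base, cap):
        try:
            return fn()
        except ThrottledError:
            # Jitter: sleep a random fraction of the ceiling so that many
            # clients do not all retry at the same moment.
            sleep(rng() * ceiling)
    return fn()  # final attempt; any error now propagates to the caller

if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky():
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise ThrottledError("rate exceeded")
        return "ok"

    print(call_with_backoff(flaky), "after", attempts["n"], "attempts")
```

The sleep and rng parameters are injectable so the retry logic can be tested without real waiting.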
network:
VPC design (allow IP overhead, space for more than one VPC per region/account, more than one subnet for each AZ, use private IP space from RFC 1918)
consider: failures within topology, misconfiguration loses connectivity, unexpected traffic spike/DDoS.
achieving 99.999% (5 minutes of downtime per year)
most ISPs only provide 99.99% so will need redundant ISPs
5 minutes means automation investment and testing, auto-rollback/recovery on every dependent service
do you really need 5 9’s? Could failure be handled differently? Does entire architecture need this SLA or just parts of it (e.g. prioritise writes over read or over control plane; prioritise business hours over weekends)? What are the most valuable/critical transactions?
decompose app into components and evaluate SLA for each so investment can be focused.
application design
fault isolation zone - use redundant/independent components in parallel; contain within a “cell” (shard, stripe), need to carefully design cell so actually independent but don’t introduce complex mapping logic (which pushes problems to mapping layer); EC2/EBS API can filter to a subnet only, don’t touch multiple isolation zones in one change (region by region, subnet by subnet)
redundancy
micro-services architecture - allows focused investment, service solves concise business problem that can be owned by a team, team can use any stack or use any lifecycle as long as API contract achieved
recovery-oriented computing (ROC) - systematic approach to improve recovery (isolation, redundancy, system-wide rollback, health monitoring, ability to provide diagnostics, ability to restart).
don’t be novel, just use small number of recovery paths (e.g. just terminate/replace instance); avoid rarely used recovery methods (simplifies testing)
distributed systems - introduces latency, traffic. Strategies:
throttle requests (expect rejected requests to retry later (perhaps exponential backoff), e.g. API Gateway; need load testing to determine limit)
fail fast (allow resources to be freed immediately)
use idempotency tokens (retries will happen, difficult to ensure exactly once processing)
constant work - if expect 100 unit/second but not enough work to achieve this, put in “filler” work to ensure you are actually getting 100 unit/second
circuit breaker - when dependency is down/latency issues, can return cached or local data to avoid propagating issues
bi-modal behaviour and static stability - failure causes more work/stress is bad. E.g. if AZ goes down, don’t try relaunching in same AZ, use health checks/ELB to shift to healthy nodes and launch new nodes async
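The circuit breaker and fail-fast strategies above can be combined in a few lines. This is a simplified sketch, not a production library: the thresholds, the class name and the injectable clock are illustrative assumptions.

```python
import time

class CircuitBreaker:
    """After max_failures consecutive errors, skip the dependency for
    reset_after seconds and serve the fallback (e.g. cached data) instead."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()   # open: fail fast, do not touch dependency
            self.opened_at = None   # half-open: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0           # success closes the circuit again
        return result

if __name__ == "__main__":
    breaker = CircuitBreaker(max_failures=2, reset_after=5.0)

    def broken_dependency():
        raise RuntimeError("dependency down")

    for _ in range(4):
        print(breaker.call(broken_dependency, lambda: "cached value"))
```

Because the fallback path does less work than the normal path rather than more, the caller stays stable while the dependency is down, which is the point of the bi-modal behaviour note.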
operational considerations
automated deployment with patterns of canary, blue-green, feature toggles (deploy with feature disabled, turn on canary, and if fails turn off)
testing - also needs to be automated (runbook), constant canary testing that simulates user behaviour (1-second test every minute called externally), include dependency failures. Test brownouts (dropped packets, latency, DNS failure)
monitoring - percentile monitoring (exceptions will get worse at higher traffic levels). Cloud benefits to monitoring (abundance of generation, aggregation into S3/CloudWatch, real-time processing and alarms sent to SNS/SQS for action, storage analytics via S3 Select, Athena, EMR, Splunk, etc.). Review retention requirements (manage policies in S3 lifecycles).
performance efficiency - use compute resource efficiently and maintain efficiency as demand/tech evolves
design principles
democratize advanced tech (consume advanced tech as a service, e.g. use DynamoDB vs having a no-SQL DBA, e.g. machine learning)
go global in minutes - e.g. CloudFormation to deploy close to customers
use serverless architecture - e.g. lambda
experiment often - deploy and compare, then tear down
best practices
compute - choose right server (CPU vs memory bound), AWS can change EC2 types, lambda (costs based on CPU usage, i.e. when used)
questions: how to select appropriate instance type? how do you ensure you continue to have the best instance type as tech evolves? how to monitor instances to check they perform as expected? how to check quantity of compute resources matches demand (auto-scaling)?
storage - “best” dependent on access method (block v file v object), pattern (random v sequential), throughput required, access frequency (instant/online v offline v archive), update frequency (static(worm) v dynamic), availability constraints, durability constraints
best practice - S3 (11 nines durability, 4 nines availability on standard but have IA, RR, Glacier, life cycles, versioning), EBS (has different media types, can move)
questions - how do you select appropriate storage (and as demand/tech evolves)? how to monitor performance? how to monitor capacity/throughput meets demand?
database - “best” depends on consistency requirements (read after write, eventual), availability, no-SQL, disaster recovery
best practice - RDS (lots of options), DynamoDB, Redshift
questions - how do you select appropriate DB (as demand/tech evolves)? monitor to check expected perf? how to check capacity/throughput meets demand?
space-time trade-off - RDS read replicas (multiple copies) to reduce time, direct connect to improve latency, global infrastructure (copy into other regions closest to customer), Elasticache, CloudFront
questions - how to select appropriate proximity/caching solution (as demand/tech evolves)? monitor to check expected perf? how to check capacity/throughput meets demand?
cost optimisation - lowest price to meet business objectives
trade capital for operational expenses (pay as you use)
cloud benefits from economies of scale
abandon data center prices (physical infrastructure, complex management)
definition
match supply-demand - auto-scaling, lambda
questions: make sure capacity matches but not substantially exceed demand? how to optimize usage of AWS services?
cost-effective resources - appropriate resource (e.g. use bigger EC2 instance to run faster)
questions: how to select appropriate resource types to meet cost target? how to select appropriate pricing model to meet cost target? are there higher-level services AWS provides (e.g. EC2 + DB vs RDS)
expenditure awareness - don’t need to quote/plan physical hardware
questions: what procedures exist to govern AWS costs? how to monitor usage/spending? CloudWatch Alarms (billing alert), SNS, Consolidated Billing. How to decommission/turn off unused resource (e.g. shutdown EC2 outside business hours), how to consider data transfer costs?
optimising over time - tech evolves (subscribe to AWS blog, use AWS Trusted Advisor)
questions: how do you consider/adopt new services over time?
operational excellence - practices/procedures to manage production workloads including how planned changes are executed, responses to operational events. Automate change execution and responses. Document
design principles
perform operations with code - documents infrastructure, can be automated
align operations process to business objectives - monitoring/metrics are business related
define KPIs based on business and customer objectives to determine workload success - e.g. response time for web API, number of purchases for e-commerce, watch-time for video
define metrics to track health of workload - e.g. CloudWatch
make small/regular changes
test responses to unexpected events (chaos monkey destroys instances, chaos snail slows instances)
learn from operational events/failures - capture/document
keep operation procedures current
definition
preparation - checklists (are we ready to move to prod?), workloads should have a runbook (operations guidance that ops teams refer to for normal daily tasks), playbooks (unexpected operational events, escalation paths, notification plans)
services: CloudFormation (documented, repeatable, tested in lower envs, reduce human error), auto-scaling (respond to business-related events), SQS (to handle load until service resumed), aws config (track/respond to changes in AWS workload), AWS service catalog (standardize), tagging
questions: what best practices are you using? how to manage config?
operation - standardize and automate, small/frequent changes without downtime and manual execution, regular QA, defined mechanism to track/audit/rollback/review changes, revert without downtime, centralize audit
questions: how to evolve workload while minimizing impact of changes? how to monitor workload to ensure operating as intended?
services: AWS CodeCommit (source control), CodeDeploy (auto deploy to EC2 or on-prem), AWS SDK to automate operations, CloudTrail to audit
response - automated (not just alerting but corrective/rollback action), QA mechanisms to auto rollback failed deploys, responses follow a playbook, automate escalation (SNS)
questions: how to respond to unplanned operational event (e.g. RDS dies)? how escalation managed?
bad: requires more maturity, scaling storage performance
hybrid silo-pool: some clients in their own silo, others in pool
data migration: silo easier (tenant-by-tenant) but silo might not be desirable. Maintain backwards compatibility (avoid changing data representation) if possible.
security: IAM closer to silo/bridge, in pool authorisation shifts to application (must ingest user’s identity)
management/monitoring: need aggregation to see tenant-wide trends, but also need tenant-centric to pinpoint specific issues. Policies and alarm responses simplified with pool
developer experience: pool simplifies and allows abstraction of horizontal aspects
linked AWS accounts: for silo, but can still consolidate billing. But limits on how many linked accounts, harder to onboard
DynamoDB: silo/bridge similar but adds provisioning obstacles. Pool centralises monitoring/operations/scaling but requires indirection
silo: prefix tables with tenant, IAM role per tenant. Prefix annoying for applications
bridge: not great fit. Relax silo by removing tenant-specific IAM roles
pool: mapping tenant key to partition key creates partition hotspots. Can be avoided by additional layer of sharding and a lookup table (maps tenants to shards). Then data stored by our shard ID. Allows to scale by using more shards for a big customer.
auto-scaling IOPS: open-source Dynamic DynamoDB can scale based on a metric, or use on-demand pricing model
multiple environments (DEV, PROD) by prefixing table names, but does introduce extra complication to code (easiest in silo since already have prefix)
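The pool-model sharding with a lookup table can be sketched without any AWS calls. Everything here is hypothetical: the tenant names, the shard counts, the `tenant#shard` key format and the choice of SHA-256 are illustrative assumptions, not a DynamoDB requirement.

```python
import hashlib

# Hypothetical lookup table: tenant -> number of shards assigned to it.
# A big customer gets more shards so its items spread over more partitions.
SHARDS_PER_TENANT = {"small-co": 1, "big-co": 4}

def partition_key(tenant_id: str, item_id: str) -> str:
    """Partition key for an item: tenant plus a stable shard number."""
    shard_count = SHARDS_PER_TENANT[tenant_id]
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{tenant_id}#{shard}"

def all_partition_keys(tenant_id: str) -> list:
    """Every key a tenant-wide read must query (one per shard)."""
    return [f"{tenant_id}#{s}" for s in range(SHARDS_PER_TENANT[tenant_id])]

if __name__ == "__main__":
    for item in ("order-1", "order-2", "order-3"):
        print(item, "->", partition_key("big-co", item))
    print(all_partition_keys("big-co"))
```

The trade-off is the indirection mentioned above: writes stay cheap, but reading all of a tenant's data means fanning out over every shard key for that tenant.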
RDS: automating silo probably not difficult but efficiency of pooling appealing
silo: separate instances per tenant
bridge: same DB instance with prefixed table names (schemas can be tenant-specific and easier to migrate data in), or same DB instance with tenant-specific database (MSSQL)/schema (Oracle) inside.
pool: tenant ID in table. Schema must be identical (can’t be dynamic like DynamoDB)
RDS limits: e.g. size limitations, require sharding
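The RDS pool model (tenant ID in a shared table) puts isolation in the application: every query must be scoped by tenant. A minimal sketch using SQLite from the Python standard library as a stand-in for an RDS engine; the table, columns and tenant names are made up.

```python
import sqlite3

# In-memory SQLite stands in for an RDS database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (tenant_id TEXT NOT NULL, order_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("acme", 1, 10.0), ("acme", 2, 5.5), ("globex", 1, 99.0)],
)

def orders_for(db: sqlite3.Connection, tenant_id: str) -> list:
    """Return only the calling tenant's rows; the tenant filter is mandatory."""
    cursor = db.execute(
        "SELECT order_id, total FROM orders WHERE tenant_id = ? ORDER BY order_id",
        (tenant_id,),
    )
    return cursor.fetchall()

if __name__ == "__main__":
    print(orders_for(conn, "acme"))
    print(orders_for(conn, "globex"))
```

Routing all data access through helpers like this (rather than ad hoc SQL) is one way to keep a forgotten WHERE clause from leaking data between tenants.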
Redshift:
limits 60 databases per cluster, 256 schemas per database, 500 concurrent connections, access to cluster means access to all databases
silo: cluster per tenant, achieve isolation with IAM, tuned individually
bridge: difficult to implement, size limitations, and can’t achieve security isolation
pool: like RDS, need careful connection management (caching), and isolation responsibility of application
can define user-defined scalar function (UDF) to be called from SQL which is defined in SQL or Python
Architecting For Scale
Desire for loosely coupled applications (atomic functional units is #1, insert abstraction layers, APIs, isolated units that can be scaled)
scaling:
horizontal: auto-scaling groups can do so elastically (cost-effective, redundant). Theoretically unlimited.
vs vertical: no easy way to automate it, have to script it. Hardware limits.
AWS Application auto-scaling - one common API for controlling scaling of non-EC2 resources, e.g. DynamoDB, ECS, EMR. Has the same policies as EC2 auto-scaling except simple and manual (i.e. has scheduled, target tracking, step)
AWS auto-scaling console to manage auto-scaling at an account-level (e.g. check everything has auto-scaling policy, add it, change it) vs going through all screens to find options.
Helps with business policies: target lowest cost, high-availability, or balanced, or custom.
vs AWS EC2 auto-scaling: EC2 auto-scaling has step/schedule policies, or just care about health. Use AWS Auto-Scaling for predictive policy, simplicity
predictive scaling. Uses machine learning to either automatically scale or you interpret and configure it. See more in “EC2” section.
pets vs cattle - servers as disposable compute units, created/replaced/terminated with automation
Business Continuity
concepts
business continuity - minimise impact to business activity during unexpected events
disaster recovery - respond to event which threatens business continuity
high availability - design in redundancy to reduce chance of service level (agreed goal or target of performance/availability, e.g. SLA) being impacted
fault tolerance - design ability to absorb problems without affecting service levels, having correct state
recovery time objective (RTO) - time taken after disruption event to restore business back to service levels
recovery point objective (RPO) - acceptable amount of data loss measured in time (e.g. time between last backup and disaster)
usually defined by business continuity plan - justify level of investment
should include recovery granularity at various levels: file, volume, application, image
service availability - percentage of time application is operating normally, typically number of 9’s. Don’t exclude scheduled outage as customers want to use services during these times too (means need high availability).
99% (3d 15h) - batch processing, transfer, load
could achieve with manual trigger of full rebuild of environments
could achieve with ELBs (self-healing), multi-AZ RDS, automated deploy and rollback
99.99% (52m; 4h 22m is the budget for 99.95%) - online commerce, point of sale
could achieve with statically stable (provision enough capacity for 1 AZ failure) so at least 3 AZs, need to prioritise read over write (or some compromise), deployment includes testing (functional, performance, load, failure injection) using canary or blue-green into isolation zones sequentially, more rigorous outlier monitoring and periodic testing/benchmarking (including backup/recovery). Possibly multi-region here.
99.999% (5m) - ATM transactions, telecommunication systems
could achieve 99.995% with multi-region in hot-standby (replicate across region), automated failover to standby, need log aggregation
could achieve 99.999% with no-sql (difficult to get small enough RPO with RDBMS; no-SQL can partition and distribute data), active/active or multi-master across regions via Route 53 routing to nearest healthy region, need runbook for moving data between regions and addressing conflicting data, dev team owns production
dependency math:
hard dependencies: multiply, e.g. 99.99% x 99.99% = 99.98%
redundant components (e.g. multi-AZ): 100% - product of downtime; e.g. two 99.9% components: 100% - (0.1% x 0.1%) = 99.9999%
which puts S3's eleven 9's (durability) in context: to reach 12 9's needs 99.9% x 4 or 99.99% x 3; 99.999% x 3 gives 15 9's (x 2 only gives 10)
estimating unknowns: estimate = MTBF / (MTBF + MTTR), e.g. MTBF is 150d, MTTR is 1h, estimate = 99.97%
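The dependency math above can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, components are assumed to fail independently, and a year is taken as 365.25 days.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def downtime_minutes_per_year(availability_pct):
    """Downtime budget per year for a given availability percentage."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


def serial(*availabilities_pct):
    """Hard dependencies: all must be up, so availabilities multiply."""
    result = 1.0
    for pct in availabilities_pct:
        result *= pct / 100
    return result * 100


def redundant(*availabilities_pct):
    """Redundant components: service is down only if every copy is down."""
    all_down = 1.0
    for pct in availabilities_pct:
        all_down *= 1 - pct / 100
    return (1 - all_down) * 100


def from_mtbf(mtbf_hours, mttr_hours):
    """Estimate availability from mean time between failures and to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours) * 100


if __name__ == "__main__":
    print(round(downtime_minutes_per_year(99.99), 1))  # 52.6 minutes
    print(round(serial(99.99, 99.99), 2))              # 99.98
    print(round(redundant(99.9, 99.9), 4))             # 99.9999
    print(round(from_mtbf(150 * 24, 1), 2))            # 99.97
```

Real components rarely fail fully independently (shared region, shared deployment pipeline), so the redundant figure is an upper bound.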
disaster categories
hardware failure - disk/network card failure
deployment failure - patch breaks business process
load induced - DDOS
data induced - receive unexpected input (64- vs 32-bit input)
credential expiration - SSL/TLS cert expiration
dependency failure - S3 being down cascades to other services
infrastructure - network outage, power outage
identifier exhaustion - no EC2 capacity
human error
disaster recovery architecture - tradeoffs between speed and cost
backup and restore - store data offsite with S3, storage gateway, snowball.
minimal effort, very common. But not flexible (just data, not processing)
examples: could have on-prem storage gateway and replicating AMI/EBS to AWS regions. Then on failure, use CloudFormation to restore EC2 instances from those AMI/EBS volumes (automate it as don’t want to do manual things at recovery time)
pilot light - primary workload on-prem with undersized cold standby in AWS (data replicated into AWS) which can serve traffic on Route 53 update
cost effective, works for lots of workloads (if RTO allows). But failover not automated, could take minutes/hours to spin up. Have to keep standby in sync.
warm standby - pilot light but undersized instances are active (could be a shadow or staging environment)
can failover in seconds/minutes. But may still need to be scaled up to production size. Probably manual still.
multi-site ("hot site") - effectively 1:1 mirror the site elsewhere, ready to take full production load
failover in seconds, can be automated (with health check on Route 53). But most expensive. Perceived to be wasteful
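For the multi-site option, the automated failover could be expressed as a pair of Route 53 failover records. The sketch below only builds the request payload and makes no AWS call; the function name, domain, IP addresses and health check ID are hypothetical placeholders.

```python
def failover_change_batch(name, primary_ip, standby_ip,
                          primary_health_check_id, ttl=60):
    """Build a Route 53 ChangeBatch with PRIMARY/SECONDARY failover records."""

    def change(role, ip, health_check_id=None):
        record_set = {
            "Name": name,
            "Type": "A",
            "SetIdentifier": f"{role.lower()}-site",
            "Failover": role,  # "PRIMARY" or "SECONDARY"
            "TTL": ttl,        # short TTL so clients pick up failover quickly
            "ResourceRecords": [{"Value": ip}],
        }
        if health_check_id is not None:
            record_set["HealthCheckId"] = health_check_id
        return {"Action": "UPSERT", "ResourceRecordSet": record_set}

    return {
        "Comment": "active/passive failover between two sites",
        "Changes": [
            change("PRIMARY", primary_ip, primary_health_check_id),
            change("SECONDARY", standby_ip),
        ],
    }


if __name__ == "__main__":
    batch = failover_change_batch(
        "app.example.com.",              # hypothetical record name
        "203.0.113.10",                  # documentation-range IPs
        "198.51.100.10",
        "hypothetical-health-check-id",
    )
    # With boto3 this would be submitted roughly as:
    #   boto3.client("route53").change_resource_record_sets(
    #       HostedZoneId="<your zone id>", ChangeBatch=batch)
    print([c["ResourceRecordSet"]["Failover"] for c in batch["Changes"]])
```

Route 53 answers with the secondary record only while the primary's health check is failing, which is what removes the manual step from failover.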
storage HA considerations
EBS volumes - target 99.999% available. Annual failure rate 0.2%. Replicated within a single AZ (so if AZ dies, screwed; use EBS snapshots in S3). Can be used in RAID.
S3 storage - eleven 9’s of durability. Availability: standard = 99.99% (52.6 min/year), IA = 99.9% (8.7 hour/year), 1-zone IA = 99.5% (1.8 day/year).
Storage Gateway - migrate on-prem to AWS as offsite backup. Continuously syncs.
Snowball - options based on volume, batch move only.
Glacier - archive
whitepaper:
hybrid AWS/on-premise: if on-prem has backup solution already could re-use it for AWS infrastructure (put agent on EC2 instances). Probably better to move master backup server to AWS though.
compute HA considerations
keep AMIs up to date, copy to other regions for additional DR
horizontal scaling preferred as risk is amortised
self-healing design with ELB and health checks. Route 53 can health check too.
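zonal reserved instances or On-Demand Capacity Reservations are the only ways to guarantee capacity when needed (regional reserved instances are a billing discount only).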
database HA considerations
DynamoDB if possible over RDS (inherent fault tolerance), else Aurora (cloud-first DB), else multi-AZ RDS.
snapshot frequently (doesn’t impact performance in multi-AZ deployment; can be done on read replica for some DB vendors)
Redshift - has no multi-AZ currently, next best is to scale nodes horizontally. Snapshot in S3
elasticache HA considerations
Memcached has no replication and no snapshots. But can scale horizontally across AZs (limit impact of one AZ/machine dying). Large in-memory space.
Redis can schedule backups, scales horizontally with replicas (and multi-AZ) and shard across nodes within AZ (only one AZ is master).
network HA considerations
use multi-AZs (subnet each)
Direct Connect not HA by default so need secondary connection (ideally with different provider) or use VPN if <1Gbps
although VPN’s customer gateway and unmanaged VPNs (software VPN) not HA either so need two
Route 53 has health checks. Has 100% DNS resolution goal.
Elastic IP allows changing backend resource without impacting DNS resolution.
NAT Gateway is zonal, so put one in each AZ (in a public subnet) with private-subnet routes pointing to the local one.
re:invent Netflix and complexity:
complex system as an unreasonable system (too big for one person to comprehend/reason) vs trying to make reasonable (i.e. trying to make simple). Lots of uncertainty.
economic complexity: defined number of states (software roadmap somewhat fixed), relationships (unbounded in software), environment, irreversibility (cloud helps break this)
accidental complexity (just happens/accumulates - difficult to refactor given other business pressures) vs essential complexity (built-in)
simplicity doesn’t mean availability - arguably need complexity to achieve availability.
chaos engineering - continually/daily introduce failures into production (complex system can’t be replicated in any other environment) to provide confidence of steady state availability (e.g. number of video starts per minute).
experiments (gaining knowledge/confidence vs testing which asserts a known property) to discover systemic weakness. Not concerned with how/why failure exists, just about finding it.
brings failure to forefront and forces engineers to design for it
in microservices, CHAP (chaos automation platform) clones the microservice under test into control and experiment groups and routes a portion of production traffic to them. Introduce failures (latency, outage, go to SLA) into the experiment group and compare with control. When deviation detected, can end experiment.
of people - send them home for a day with no contact
intuition engineering - knowing/feeling when system is “off”. Vizceral WebGL visualisation of traffic, shows steady state, easy for humans to consume
re:Invent multi-region active-active - don’t do if you don’t need it (it’s hard)
using active-active avoids the complaint of “wasted” DR/standby being idle, and configuration/data drift.
requirements:
reliable secure network, i.e. AWS backbone
data replication: synchronous (lag) or asynchronous (continuous still has lag or batch introducing eventual consistency)
want to avoid active-active at transaction level (two regions writing same DB rows), at system level OK (partition by customer/geography). Data conflicts hard to resolve.
what’s the minimum? Does everything need to be replicated? Can it be async (preferred)? Classify data and apply different rules
traffic segregation and management - different URLs for each region, DNS-level, traffic management (throttling, internal redirect, external redirect). Consider what happens if you log into “wrong” (define this) region? Just some delay?
monitoring - more things to monitor, including metrics for replication lag and code sync.
perhaps only replicate high-value user logs and KPIs (lower volume) to other regions
tools: S3 doesn’t have a replication lag metric but CloudFormation solution exists to build it (CloudTrail -> CloudWatch -> SNS -> SQS -> Lambda -> DynamoDB), Elasticsearch
multi-tenancy - unit of movement (move tenants between regions). Statically stable (each region can handle at least average workload during failure)
failover - needs to do traffic rerouting, change direction of replication
tolerance for network partitioning - region isolation (limit blast radius/dependency), no API calls between regions. “Split-brain” problem (network between region dies) so each region thinks the other failed, who’s correct?
distributed system best practices - idempotency, eventual consistency (rewriting apps to cater for this hard but required), static stability, exponential backoff, throttling, circuit breaking
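Two of the practices listed above, exponential backoff and idempotency, fit in a short sketch. All names here (`TransientError`, `call_with_backoff`, `IdempotentStore`) are hypothetical; the jitter scheme is "full jitter" (random delay up to a doubling ceiling), which is one common choice rather than the only one.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep, rng=random.random):
    """Retry operation on TransientError with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * ceiling)  # full jitter: random delay in [0, ceiling)


class IdempotentStore:
    """Applies each request at most once, keyed by an idempotency key."""

    def __init__(self):
        self.balance = 0
        self._results = {}

    def credit(self, idempotency_key, amount):
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # replay: no second credit
        self.balance += amount
        self._results[idempotency_key] = self.balance
        return self.balance


if __name__ == "__main__":
    store = IdempotentStore()
    calls = {"n": 0}

    def flaky_credit():
        calls["n"] += 1
        result = store.credit("request-1", 10)  # the write succeeds...
        if calls["n"] < 3:
            raise TransientError("response lost")  # ...but the reply is lost
        return result

    print(call_with_backoff(flaky_credit), store.balance)
```

The retries are only safe because the write is idempotent: the same key is sent three times but the balance is credited once.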
designs:
simple - master in one-region with writes in other regions going here. Read-replicas in other regions (generally more reads will happen than writes). But violates regions as isolation zone, write lag
multi-tenant - each region has its own master and standby for its own customers with replication to other regions. Will lose some transactions during failover, but are in standby so can reconcile.
other tools: AWS Aurora multi-master, DMS replication (for granular replication), DynamoDB Global Tables.
parameter localisation - maintain app configuration outside app (so they’re stateless/immutable) with in-region local parameter store controlled by multi-region central configuration app; central code repo to ensure deploying same thing.
pro-tips
FMEA (Failure Mode Effects Analysis) - systematically examine: i) what could go wrong, ii) what impact it might have, iii) how likely it is, iv) what is our ability to detect and react.
can get really big. Perhaps evaluate across disaster categories.
risk priority number (RPN) - severity x probability x likelihood of not detecting
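The RPN calculation can be sketched as below. The 1 to 10 scale for each factor is a common FMEA convention assumed here, and the failure modes and scores are invented purely to show the ranking.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int        # 1 (negligible) .. 10 (catastrophic)
    probability: int     # 1 (rare) .. 10 (frequent)
    non_detection: int   # 1 (always detected) .. 10 (never detected)

    @property
    def rpn(self):
        """Risk priority number: product of the three scores."""
        return self.severity * self.probability * self.non_detection


def rank(modes):
    """Highest RPN first, i.e. what to address first."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


if __name__ == "__main__":
    # Illustrative scores only, one per disaster category
    modes = [
        FailureMode("disk failure", severity=4, probability=6, non_detection=2),
        FailureMode("TLS cert expiration", severity=7, probability=3, non_detection=8),
        FailureMode("DDoS", severity=8, probability=4, non_detection=3),
    ]
    for mode in rank(modes):
        print(mode.rpn, mode.name)
```

A failure that is rare but hard to detect (the certificate case) can outrank a frequent but well-monitored one, which is the point of including detection in the score.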
Example Architectures
MEAN stack - MongoDB, Express.js, Angular, and Node.js - everything is JS.
3 tier architecture on AWS
Architecture: Route 53, CloudFront, web tier of EC2 fleet behind ELB/ASG, to app tier behind ELB/ASG, to RDS master with multi-AZ standby, read replicas.
Security: security groups between tiers, AWS WAF, DDOS protection via CloudFront or AWS Shield
Game architecture on AWS
Architecture: leverage stateless REST (easy to horizontally scale, errors are retriable), separate stateful servers (e.g. for chat, realtime multiplayer, matchmaking), binary objects in S3 with signed URLs to get/post
stateful: AWS Gamelift offers managed servers, but leveraging stateless REST as much as possible ideal (e.g. FPS is long-running stateful server, but stats, friends lists, and finding a matchmaking server can be REST)
push notifications: e.g. send invite between player (via SNS), or SNS mobile push notifications (not real-time though)
storage
no-SQL: more scalable than RDBMS. Friends/leaderboards/levels/items as sets/lists which fit no-SQL (not relationship data)
Redis for leaderboards, game lists, player counts, stats, inventories. Can provide transient, low-latency storage and then persist to MongoDB at some logical point (e.g. level-end)
DynamoDB: key-value (user data, friends, history), range key (sort leaderboards, scores, date-ordering), atomic counters. Should shard (e.g. different tables for leaderboards of different game modes)
S3: upload directly to S3 with presigned URL. Store binary data in S3 and metadata into DB
analytics: collect metrics locally (say CSV format) and upload to S3, trigger S3 event notification to send to SQS and into EMR/Redshift/Kinesis/Athena
loosely coupled and async architecture: for things not needed in real time, split with SQS
Microservices Architecture
decompose monolithic apps into smaller services that use APIs, deployed independently
allows selection of best tech for the job, e.g. rather than one large RDBMS in on-prem world, pick what’s best fit
can really only be done in cloud (standing up entire vertical stack and elastic scaling difficult on-prem)
traditionally: split by technology (web, apps, persistent 3-tier architecture), but microservices split vertically:
presentation: use JS framework to present single-page using RESTful API. Static content via S3 and CloudFront (although co-located microservices will increase latency if CDN used).
microservices layer: serverless on Lambda or ECS Fargate. ECR for containers.
data layer: DynamoDB with DAX, RDS with Elasticache
reducing operating complexity: API Gateway for managing API, authentication; go serverless (no server to maintain); using SAM to deploy DynamoDB
distributed systems:
service discovery: figure out where microservices are, health checks, where to store config
DNS-based: ECS integrates with Route 53 for you (previously, had to do your own Lambda, Route 53, ECS Event Stream; or use ELB). Kubernetes has unified service discovery. See also “AWS Cloud Map”.
service meshes: adds layer to manage an application composed of hundreds/thousands of microservices. Typically: data plane (proxy which intercepts all traffic and decides where to route) and control plane (control proxies). Transparent to app. Standardises microservice discovery/communication. e.g. AWS App Mesh
roll your own on Lambda (collects resources via tags) and puts into Route 53 A-records in private zone
distributed data management:
microservices introduces transaction/atomic difficulties if one API call hits multiple microservices. Fix by using a saga pattern (maintain state machine and rollback each microservice on failure), which can be implemented with Step Functions. Helps if services idempotent.
event sourcing pattern - model changes as event not application state (e.g. DB transaction logs, source control). Allows state to be reconstructed at any point in time, natural audit trail. With microservices, can pub/sub events and consume by multiple microservices. Implement on Kinesis Data Stream.
often combined with CQRS pattern (Command and Query Responsibility Segregation). Separate write and read workloads. E.g. microservice has writes, but difficult to join across microservices, so use event sourcing to consume and store in read-specific structure/tech. Allows separate scaling of writes and reads, separate security. But is more complex, introduces message duplicates/failures, and is eventually consistent.
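Event sourcing with a CQRS read model, as described above, can be sketched in memory. The in-process `EventStore` is a stand-in for a stream such as Kinesis, and the account/balance domain and all class names are assumptions made for illustration. The duplicate check assumes events arrive in order.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    sequence: int
    account: str
    kind: str      # "deposited" or "withdrawn"
    amount: int


class EventStore:
    """Write side: append-only log that publishes to subscribers."""

    def __init__(self):
        self._events = []
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def append(self, account, kind, amount):
        event = Event(len(self._events) + 1, account, kind, amount)
        self._events.append(event)
        for handler in self._subscribers:
            handler(event)
        return event

    def replay(self, up_to=None):
        for event in self._events:
            if up_to is not None and event.sequence > up_to:
                break
            yield event


def apply_event(balances, event):
    sign = 1 if event.kind == "deposited" else -1
    balances[event.account] += sign * event.amount


class BalanceReadModel:
    """Read side: query-optimised view built from consumed events."""

    def __init__(self):
        self.balances = defaultdict(int)
        self._last_seen = 0

    def handle(self, event):
        if event.sequence <= self._last_seen:
            return  # duplicate delivery, already applied
        apply_event(self.balances, event)
        self._last_seen = event.sequence


def state_at(store, sequence):
    """Reconstruct state as of a given point by replaying the log."""
    balances = defaultdict(int)
    for event in store.replay(up_to=sequence):
        apply_event(balances, event)
    return dict(balances)


if __name__ == "__main__":
    store = EventStore()
    view = BalanceReadModel()
    store.subscribe(view.handle)
    first = store.append("alice", "deposited", 100)
    store.append("alice", "withdrawn", 30)
    view.handle(first)  # simulated duplicate message
    print(dict(view.balances), state_at(store, 1))
```

The log is the source of truth; the read model can be dropped and rebuilt from `replay()` at any time, which is also what gives the audit trail.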
asynchronous communication and lightweight messaging
REST-based: stateless, use API Gateway
asynchronous message passing: using a queue to decouple. No service discovery required. Could use SNS to push to multiple SQS queues to have multiple consumers
orchestration and state management: don’t add orchestration logic to code. Step Functions coordinate microservices in a state machine
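Keeping orchestration out of the services themselves is what makes the saga pattern workable: one coordinator runs each step and, on failure, runs the compensations in reverse. The sketch below is a plain-Python stand-in for what a Step Functions state machine would do; the order-processing steps and all function names are hypothetical.

```python
class SagaFailed(Exception):
    """Raised after a step fails and earlier steps have been compensated."""


def run_saga(steps, context):
    """Run (name, action, compensation) steps; undo completed ones on failure."""
    completed = []
    for name, action, compensate in steps:
        try:
            action(context)
        except Exception as exc:
            # Roll back in reverse order of completion
            for _, undo in reversed(completed):
                undo(context)
            raise SagaFailed(f"step '{name}' failed, saga rolled back") from exc
        completed.append((name, compensate))
    return context


# Hypothetical microservice calls, recorded in a log for illustration
def reserve_stock(ctx):
    ctx["log"].append("stock reserved")


def release_stock(ctx):
    ctx["log"].append("stock released")


def charge_card(ctx):
    raise RuntimeError("card declined")


def refund_card(ctx):
    ctx["log"].append("card refunded")


ORDER_SAGA = [
    ("reserve_stock", reserve_stock, release_stock),
    ("charge_card", charge_card, refund_card),
]

if __name__ == "__main__":
    ctx = {"log": []}
    try:
        run_saga(ORDER_SAGA, ctx)
    except SagaFailed as err:
        print(err, ctx["log"])
```

Compensations may themselves be retried after a crash, so they should be idempotent, as should the forward actions.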
distributed monitoring: use CloudWatch to centralise
for containers: Prometheus/Grafana with EKS, or use FluentD agent to send to CloudWatch
distributed tracing using X-Ray
log analysis and visualisations: feed CloudWatch and S3 logs into Elasticsearch + Kibana; into Kinesis Data Firehose (if from CloudWatch, not needed if in S3) into Redshift and view with QuickSight.
chattiness: microservices creates network load, even lightweight REST could cause issues at high volume. Options:
consolidate services that frequently send data to each other
format: JSON/YAML or binary (e.g. Avro)
cache: with Elasticache or API Gateway caching
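To get a feel for the format option, the sketch below compares the size of one small message encoded as JSON and as a fixed binary layout. Python's standard `struct` module is used as a stand-in for a schema-based binary format such as Avro (which needs a third-party library); the message fields are made up.

```python
import json
import struct

# Hypothetical score-update message: two unsigned 32-bit ints and a float32,
# little-endian, no padding
RECORD_FORMAT = "<IIf"


def encode_json(player_id, score, latency_ms):
    message = {"player_id": player_id, "score": score, "latency_ms": latency_ms}
    return json.dumps(message).encode("utf-8")


def encode_binary(player_id, score, latency_ms):
    return struct.pack(RECORD_FORMAT, player_id, score, latency_ms)


def decode_binary(payload):
    """Return (player_id, score, latency_ms); field names live in the schema."""
    return struct.unpack(RECORD_FORMAT, payload)


if __name__ == "__main__":
    as_json = encode_json(42, 1234, 12.5)
    as_binary = encode_binary(42, 1234, 12.5)
    print(len(as_json), "bytes as JSON vs", len(as_binary), "bytes as binary")
    print(decode_binary(as_binary))
```

The saving comes from not repeating field names in every message; the cost is that both sides must agree on the schema and payloads are no longer human-readable.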
auditing
CloudTrail for auditing; CloudWatch Events defining a custom event + CloudTrail to trigger Lambda for remediation.
resource inventory and change management: AWS Config
going further - a microservice could be composed of multiple stacks allowing each part of the microservice to pick the best service based on its pattern (read heavy vs write heavy; sync vs async; consistent vs spikey load)
launch image, in east1d (different), user data to sync (although cron should be doing this too)
auto-scaling group
delete EC2 instances
Global Application
routing - can’t ELB across regions, could use Route 53 latency or geolocation routing policy, CloudFront to cache
storage - RDS with multi-AZ deployment (does failover automatically, but can’t read this instance, it’s purely for disaster recovery), shard the data and requests yourself?