Thursday, 15 December 2016

Some AWS papers worth reading:

aws-building-fault-tolerant-applications_1428519022

https://drive.google.com/open?id=0B_FWc-VYFuxSejluQk1VOHJRLUU


aws-amazon-emr-best-practices_1427755536

https://drive.google.com/open?id=0B_FWc-VYFuxSQWF5TjE3eUpwenM

aws-cloud-architectures_1427123268

https://drive.google.com/open?id=0B_FWc-VYFuxSWEtCRnQtSkpuLUk

aws-cloud-best-practices_1427123288

https://drive.google.com/open?id=0B_FWc-VYFuxSeVl5VWVyaExGS28


aws-csa-2015-slides_1479830830

https://drive.google.com/open?id=0B_FWc-VYFuxSVWxFZDdxUVZHUm8


aws-disaster-recovery_1428162405

https://drive.google.com/open?id=0B_FWc-VYFuxSNzQ3R09ZSUhWS0k

aws-security-best-practices_1427123343

https://drive.google.com/open?id=0B_FWc-VYFuxSUDVNUEVJZ1hhOHM

aws-security-whitepaper_1427042776

https://drive.google.com/open?id=0B_FWc-VYFuxSMmdOb1lUQXpsVmc

aws-storage-options_1427123308

https://drive.google.com/open?id=0B_FWc-VYFuxSV0l6N0lvRzhId1U

exam-blueprint-new_1479416469

https://drive.google.com/open?id=0B_FWc-VYFuxSNzBYU0ZIUENNWDQ

intro-to-aws-security_1469227682

https://drive.google.com/open?id=0B_FWc-VYFuxSSjF1d0kzZ09ZMXM


Thursday, 8 December 2016

Some random things about AWS that cross my mind

Some things to consider while architecting an AWS environment:


  • Deploy instances across at least two AZs.
  • Purchase reserved instances in DR zones; this guarantees you instance capacity there.
  • Use Route 53 to implement DNS failover or latency-based routing.
  • Maintain backup strategies.
  • Keep AMIs up to date.
  • Back up AMIs/EBS snapshots to multiple regions.
  • Design your system for failure (read: Chaos Monkey) and have good resilience testing in place.
  • Use Elastic IP addresses to fail over to standby instances when Auto Scaling and load balancing are not available.
  • Decouple application components using services like SQS.
  • Throw away broken instances.
  • Utilize bootstrapping to bring up new instances with minimal config.
  • Utilize CloudWatch to monitor infra changes/health.
  • Always enable RDS Multi-AZ and automated backups.
  • Create self-healing applications.
  • Utilize multipart upload for S3 uploads.
  • Cache static content using CloudFront.
  • Protect data in transit by using HTTPS/SSL endpoints, and connect to instances using bastion hosts or VPN connections.
  • Protect data at rest using encrypted file systems or EBS/S3 encryption options.
  • Use AWS Key Management Service; use centralized sign-on for on-prem users and apply it to EC2 instance/IAM logins.
  • Never store API keys in an AMI. Instead use IAM roles on EC2 instances, as in the sketch below.
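Here is a minimal sketch of the IAM-role approach, assuming boto3 is installed and the instance has a role attached; the bucket name is a placeholder:

    import boto3

    # No access keys anywhere in code or on the AMI: on an EC2 instance with
    # an IAM role attached, boto3 fetches temporary credentials from the
    # instance metadata service automatically and rotates them for you.
    s3 = boto3.client("s3")

    # "my-app-bucket" is a hypothetical bucket; any call the role's policy
    # allows works the same way.
    for obj in s3.list_objects_v2(Bucket="my-app-bucket").get("Contents", []):
        print(obj["Key"])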

Use CloudWatch to shut down inactive instances. CloudWatch provides basic monitoring checks out of the box; the following need custom scripts (a sketch of publishing one such custom metric follows this list):
  • Disk usage, available disk space.
  • Swap usage, available swap.
  • Memory usage, available memory.
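As a rough sketch of such a custom script (assuming boto3 on a Linux instance; the namespace and instance ID below are made up), you can parse /proc/meminfo and push the result to CloudWatch:

    import boto3

    def read_meminfo_kb(field):
        # Parse /proc/meminfo (Linux) for a field like "MemAvailable".
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith(field + ":"):
                    return int(line.split()[1])
        raise KeyError(field)

    total = read_meminfo_kb("MemTotal")
    avail = read_meminfo_kb("MemAvailable")
    used_pct = 100.0 * (total - avail) / total

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace="Custom/System",  # placeholder namespace
        MetricData=[{
            "MetricName": "MemoryUtilization",
            "Dimensions": [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
            "Value": used_pct,
            "Unit": "Percent",
        }],
    )

Run it from cron every minute or five and the metric shows up under the custom namespace, ready for alarms.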


Use the AWS Config service, which provides detailed configuration information about an environment. It takes point-in-time snapshots of all AWS resources to determine the state of your environment, lets you view historical configurations by browsing those snapshots, sends notifications, and shows relationships between different AWS resources.

Use AWS CloudTrail for security/compliance and to monitor all actions taken against the AWS account.

Know when to use S3, S3 RRS and the Glacier service for storage.
Know when to run DBs on EC2 and when to use the managed RDS service.

Have elasticity/scalability:

  • Proactive cyclic scaling / proactive event-based scaling / demand-based auto scaling.
  • Vertical scaling - adding more resources to your instances, i.e. increasing their size/type.
  • Horizontal scaling - adding more instances alongside existing ones.
  • E.g. DynamoDB has built-in high availability/scalability. For RDS, use read replicas to offload heavy read traffic or increase the instance size, and use ElastiCache clusters for caching DB session information.

Data Security with AWS :

There is a shared responsibility model.
AWS is responsible for:

  • Facilities
  • Physical security of hardware
  • Network infra
  • Virtualization infra
Customers (we) are responsible for:
  • AMIs
  • OS
  • Apps
  • Data in transit/at rest/in data stores
  • Creds
  • Policies and configs
AWS provides us with network-level firewalls. Inside, we have SGs, which provide another security layer at the EC2 instance level. You can also run OS-level firewalls (e.g. iptables, firewalld, Windows Firewall) on individual EC2 instances, and antivirus software like Trend Micro that integrates with AWS EC2 instances.

Remember: one requires AWS's permission to do any type of port scanning, even in your own private cloud.



  • S3 has built-in AES-256 encryption that encrypts data at rest. Data is decrypted as it is sent back to the customer on download (see the sketch after this list).
  • EBS encrypted volumes - data is encrypted at the EC2 instance and copied to EBS for storage. Any snapshot taken is automatically encrypted.
  • RDS encryption - MySQL, Oracle, PostgreSQL and MS SQL all support this feature. It encrypts the underlying storage for that instance; automated backups, snapshots and read replicas are encrypted as well. RDS also provides SSL to encrypt the connection to a DB instance.
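A quick sketch of SSE-S3 with boto3 (bucket and key names are placeholders):

    import boto3

    s3 = boto3.client("s3")

    # SSE-S3: S3 encrypts the object with AES-256 before writing it to disk
    # and decrypts it transparently on GET.
    s3.put_object(
        Bucket="my-secure-bucket",
        Key="reports/q4.csv",
        Body=b"col1,col2\n1,2\n",
        ServerSideEncryption="AES256",   # or "aws:kms" to use a KMS key
    )

    head = s3.head_object(Bucket="my-secure-bucket", Key="reports/q4.csv")
    print(head["ServerSideEncryption"])  # -> "AES256"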

CloudHSM :

A Hardware Security Module is a dedicated physical machine/appliance, isolated in order to store security keys and other types of encryption keys used within an application. HSMs have special security mechanisms: they are physically separated from other resources and are tamper resistant.

Not even "AWS engineers" have access to the keys on CloudHSM appliance.


AMIs can be shared between different user accounts, but only within the same region!

You cannot use CNAMEs at the apex of a domain in Route 53. The apex is just google.com, with no subdomain.
But you can put a CNAME to a load balancer on a subdomain like news.google.com.
For the apex, create an Alias record and select the LB name as the target, as in the sketch below.
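A hedged boto3 sketch of that apex Alias record; the zone IDs, domain and ELB DNS name are all placeholders (note the AliasTarget zone ID is the ELB's canonical hosted zone ID, not your own zone's):

    import boto3

    route53 = boto3.client("route53")

    route53.change_resource_record_sets(
        HostedZoneId="Z1EXAMPLE",                 # your hosted zone (placeholder)
        ChangeBatch={
            "Comment": "Alias the zone apex to the load balancer",
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "example.com.",       # the apex itself
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": "Z2EXAMPLE",  # the ELB's zone ID (per region)
                        "DNSName": "my-elb-1234567890.us-east-1.elb.amazonaws.com",
                        "EvaluateTargetHealth": False,
                    },
                },
            }],
        },
    )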

As long as the instances belong to a public subnet, the ELB will direct traffic to them even if the instances have no public IP assigned.
You can configure access logs on the ELB so that you can view the source IPs of traffic requests.




Disaster Recovery 


  • Recovery Time Objective (RTO): the time it takes after a disruption to restore operations back to the regular service level, as defined by the company's operational level agreement. E.g. if the RTO is 3 hours, we have 3 hours to restore back to an acceptable service level.
  • Recovery Point Objective (RPO): the acceptable amount of data loss, measured in time. E.g. if the system goes down at 4PM and the RPO is 2 hours, the system should recover all application data as it was at 2PM.


  1. Pilot Light: a minimal version of the prod environment is always running on AWS, with data replication from on-prem to AWS ongoing. In the event of a disaster the rest of the environment is brought up and a DNS switch is done.
  2. Warm Standby: larger than pilot light. Critical apps run in standby mode in AWS; the rest of the instances run in pilot-light mode.
  3. Multi-Site solution: a clone of the prod environment; just flip the switch. It's costly, but you have almost no downtime in the event of a disaster.


AWS Direct Connect

When we connect to our cloud network we usually go over the open internet, which incurs network costs and network latency via public routing. All of this can be avoided by having a direct connection from your on-prem network to your cloud network.

Thus Direct Connect reduces bandwidth costs: data transferred over Direct Connect is billed at a lower rate, and the dedicated private connection reduces latency compared to sending traffic via public routing.

So it's like a dedicated private connection. We can also use multiple virtual interfaces.

Over a private virtual interface, Direct Connect only gives access to private internal IP addresses; you cannot reach public IP addresses that way.


We can use two Direct Connect links in active/active or active/passive mode for HA. You can also use a public virtual interface to connect to any public service like S3, DynamoDB etc.

A Direct Connect link provides access to only one specific region.


Amazon Kinesis

It is a developer service: a real-time data processing service to capture and store large amounts of data and power real-time streaming dashboards of incoming data streams.
Kinesis dashboards can be created using the AWS SDKs. Kinesis also allows us to export data to other AWS services for storage.
Multiple Kinesis apps can process the same incoming data stream.
Kinesis syncs data across 3 data centers (AZs) in 1 region. The data is retained for 24 hours by default.
Very highly scalable.

Kinesis can be used in applications like real time gaming, real time analytics, application alerts, mobile data etc.

Kinesis workflow (a minimal producer sketch follows):
  • Create a stream
  • Build producers to input data
  • Consumers consume the stream: can be S3, EMR, Redshift
1 shard = 1 MB/sec of write throughput (and 2 MB/sec of read throughput).
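A producer sketch with boto3 (the stream name and event shape are made up):

    import json
    import boto3

    kinesis = boto3.client("kinesis")

    # The partition key decides which shard a record lands on, so pick keys
    # that spread traffic evenly across your shards.
    event = {"player": "p42", "score": 1337}
    kinesis.put_record(
        StreamName="game-events",
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=event["player"],
    )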

Amazon CloudFront

It's a global CDN which delivers content from an origin location to edge locations.
The origin can be an S3 bucket or an ELB CNAME that distributes requests among origin instances.
Signed URLs can be used to provide limited-time access to private content. CloudFront can also work with Route 53 for alternate CNAMEs like cdn.xyz.com.

CloudFront is designed for caching. It will serve the cached file stored at the edge until the cache expires (this only applies if caching is enabled).

In order to serve a new version of an object you will have to either upload a new object with a new name or invalidate the existing object. Remember: invalidations on CloudFront cost money; the first 1,000 invalidation paths each month are free. A sketch of issuing one follows.
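A sketch of an invalidation with boto3 (distribution ID and path are placeholders; each listed path counts against the monthly free allowance):

    import time
    import boto3

    cloudfront = boto3.client("cloudfront")

    cloudfront.create_invalidation(
        DistributionId="E1EXAMPLE",
        InvalidationBatch={
            "Paths": {"Quantity": 1, "Items": ["/css/site.css"]},
            # CallerReference must be unique per request; a timestamp works.
            "CallerReference": str(time.time()),
        },
    )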

You can also view additional info, like what percentage of viewers are on mobile, desktop, tablet etc.

The CDN can be granted S3 bucket list permissions even when the bucket itself has no list-permission policy!

You can also create separate behaviors to restrict access to certain files by path pattern (e.g. finance/*), or a behavior where everything ending in *.xml goes to a certain origin, etc.


Route 53

DNS/domain hosting service which can be used for failover.
Route 53 also provides latency-based routing.
It also provides weighted record sets for routing.
Geo-based record sets allow us to send requests coming from specific regions to a specific endpoint.


SNS (Simple Notification Service)

Used to receive notifications when events occur in the environment. 

SNS consists of topics (where messages are sent) and subscription endpoints (HTTP/HTTPS, Email, Email-JSON, SMS, SQS queues, applications). A minimal publish sketch follows.
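A publish sketch with boto3 (the topic ARN and message are invented):

    import boto3

    sns = boto3.client("sns")

    # Every confirmed subscription on the topic (email, SMS, SQS queue,
    # HTTPS endpoint, ...) receives this message.
    sns.publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:env-alerts",
        Subject="Instance state change",
        Message="i-0123456789abcdef0 entered the 'stopped' state",
    )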

SQS (Simple Queue Service)

Highly available queue service. Allows creation of distributed/decoupled application components.
Each message can contain up to 256 KB of text in any format.
SQS guarantees message delivery but does not guarantee ordering or the absence of duplicates.
Generally a worker instance will poll the queue to retrieve waiting messages for processing.

I have worked with Azure queues at one of the firms where I worked. A really good service if you would like to have a distributed application.

There are 2 types of polling (a long-polling sketch follows this list),
  • Long polling - allows the SQS service to wait until a message is available in the queue before sending a response. The connection stays open for the defined wait time (up to 20 seconds).
  • Short polling - will not return all possible messages in a poll; it samples only a subset of the queue's servers, returning a subset of messages and thus increasing the number of API calls.
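A long-polling worker sketch in boto3 (the queue name is a placeholder):

    import boto3

    sqs = boto3.client("sqs")
    queue_url = sqs.get_queue_url(QueueName="work-queue")["QueueUrl"]

    # Long polling: the call blocks for up to WaitTimeSeconds (max 20) until
    # a message arrives, instead of returning empty responses immediately.
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,
    )

    for msg in resp.get("Messages", []):
        print("processing:", msg["Body"])
        # Delete after successful processing, or the message becomes
        # visible again once its visibility timeout expires.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])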
The difference between SQS and SWF is that SWF is task-oriented.

Amazon EMR (Elastic Map Reduce)

EMR is a service which deploys EC2 instances based on the Hadoop big-data framework. EMR is used to analyze and process vast amounts of data.
EMR can run any open-source application built on Hadoop.

EMR just launches pre-built AMIs with Hadoop baked in. It gives admin access to the underlying OS and integrates with AWS services like S3, DynamoDB etc. Bootstrap actions make it possible to pass config information before Hadoop starts when a new cluster is created.

Basically EMR is made up of mappers and reducers. Mappers split large data files for processing; reducers take the results and combine them into a data file on disk.

Files are processed in chunks of 128 MB/64 MB.
Every EC2 instance size has its own mapper and reducer limits, e.g. one m1.small EC2 instance has 2 mappers and 1 reducer.

The goal should be to use as many mappers as you can without running out of memory for loading the data.

Master node - only provides info about the data, like where it is coming from and where it should go.
Core node - processes data and stores it on HDFS or S3.
Task node - processes data and sends it back to a core node for storage on either HDFS or S3.

Amazon CloudFormation

The pure definition of infra as code: we can create resources using JSON templates.
You can create a new stack, or create a template from an existing infra setup.
We can have 20 CloudFormation stacks, and stack names must be unique within your environment. A minimal create-stack sketch follows.
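A create-stack sketch in boto3; the template deliberately declares just one S3 bucket, and the stack name is a placeholder:

    import json
    import boto3

    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "DemoBucket": {"Type": "AWS::S3::Bucket"},
        },
    }

    cfn = boto3.client("cloudformation")
    cfn.create_stack(
        StackName="demo-stack",            # must be unique in this region
        TemplateBody=json.dumps(template),
    )
    cfn.get_waiter("stack_create_complete").wait(StackName="demo-stack")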

Troubleshooting AWS

Ports are closed by default on SGs. You need to open ports to connect to EC2 instances.
EBS volumes must be in the same AZ as the EC2 instances they attach to.
Watch out for EC2 instance capacity limits.
An EC2 instance needs a public IP address and needs to be in a public subnet in order to connect to the internet.
In EC2-Classic the Elastic IP address will be detached when the instance is stopped.
AMIs are available only in the regions they are created in. AMIs copied to new regions receive new AMI IDs.

Make sure to auto-assign public IP addresses to EC2 instances while creating them so you don't have to assign a public IP address manually.
You should add on-prem routes to the Virtual Private Gateway route table in order to access extended resources.
When you add a NAT instance you need to add a 0.0.0.0/0 route pointing to i-xxx (the NAT instance) in the route table for the private subnets.
Peering connections between VPCs can only be made between VPCs in the same region.
Make sure the proper ports are open in both NACLs and SGs.
Only one internet gateway can be attached to a VPC.
Only one Virtual Private Gateway is needed on a VPC.
You can assign an EC2 instance to multiple SGs.

You should enable cross-zone load balancing.
Check the health checks assigned to the load balancer in order to see whether instances are healthy or not.
The ELB needs to have an SG with port 80 open.
Enable ELB access logs to an S3 bucket for better troubleshooting.
A subnet needs to be added to the ELB in order to add its instances for load balancing.


Sunday, 4 December 2016

Insights into Dockers and Containers !

Docker Inception

Installation : 
Docker requires a 64-bit installation. First we can go ahead and create a Docker repository file in the /etc/yum.repos.d folder (be aware I am demonstrating this on CentOS). Give the appropriate base URL, gpgkey etc.
Do a sudo yum -y install docker-engine to install Docker.
Then you need to enable Docker using sudo systemctl enable docker and also start the Docker daemon service with sudo systemctl start docker.

Now that you have installed Docker, you will have observed that in order to run the docker images command you have to be root (sudo). This is because your current user has not been added to the docker group: the client needs to connect to the "docker.sock" file, which is owned by root and belongs to the "docker" group.


Like Git, we have repos for Docker too. The common place to get Docker images is Docker Hub.

Go ahead and create an account on hub.docker.com. It's similar to the Chef Supermarket: one place for public and private Docker images.

Remember, base images do not run by themselves; we need to create a container from them with all our configuration. All images are built from a Dockerfile.

Watch this space! Much more incoming.

Sunday, 27 November 2016

AWS VPC !!!!!!

VPC tidbits 

Resembles our own datacenter, like our own on-premises private corporate network.
Has private and public subnets and a scalable architecture. Ability to extend the corporate/on-premise network to the cloud as if it were part of our network (using VPN).

Benefits of VPC :
  • Launch instances into subnets
  • Define a custom CIDR (IP address range) inside each subnet.
  • Configure route tables between subnets
  • Configure an internet gateway and attach it to the VPC
  • Create a layered network of resources.
  • Security settings to protect cloud resources.
  • Expand the network in the cloud with VPN/tunnelling.
  • Layered security
    • Instance security groups
    • Network ACLs at subnet level
Default VPC: has an internet gateway attached. Each instance gets a default private and public IP address (defined in the subnet settings). The public IP address is attached to an Elastic Network Interface that has a private IP address associated with it. This is NATing, i.e. routing from the public IP to the private IP and eventually to the instance.

VPC peering: a direct network route between one VPC and another. Allows sharing of resources as if they were on the same network.
  • VPC peering can happen between other AWS accounts and other VPCs within the same region.
  • VPC peering cannot occur between two regions.
  • Scenarios -
    • Peering two VPCs - multiple VPCs of a company linked under one private network.
    • Peering to a VPC - multiple VPCs can connect with a central VPC but cannot communicate with each other; the only communication that can occur is between each peered VPC and the primary. E.g. if a third party wanted to share a resource like a file share.
VPC Limits :
  • 5 VPCs per region
  • 5 internet gateways (this equals the VPC limit, as we can only have one internet gateway per VPC)
  • 50 customer gateways per region
  • 50 VPN connections per region
  • 200 route tables per region / 50 entries per route table
  • 5 Elastic IP addresses
  • 500 security groups per VPC
  • 50 rules per security group (remember security groups are scoped to a VPC)


All instances launched into the default SG can communicate with each other.
By default each instance has a route to every other instance in the subnet in which it is created.

The difference between a public and a private subnet is having a route to an internet gateway or not.

In order to connect to the internet an instance must have a route to an internet gateway and also a public IP.

For high availability you should have multiple instances in multiple AZs, and you can use a load balancer to achieve the high availability.

Unless required, never expose instances to the internet by assigning them a public IP address.

Some takeaways :

  • Each subnet must be associated with a route table.
  • By default traffic is allowed between all subnets within our VPC; this is called the local route.
  • We cannot modify or delete the local route.
  • Best practice is to leave the default route table alone and create new route tables as needed.
  • To enable access to/from the internet for instances in a VPC subnet, we must attach an internet gateway to the VPC and also ensure that the subnet's route table points to the internet gateway.
  • In order to download software and updates to a private instance one can use a NAT instance.
  • The NAT instance must be created in a public subnet.
  • The NAT instance must be in the private subnet's route table.
AWS provides DNS servers for VPCs so each instance gets a host name.

VPC Security

Security Groups

  • Operate at instance level
  • Supports allow rules only
  • Is stateful - so return traffic requests are allowed regardless of rules.
  • Evaluates all rules before deciding to allow traffic
Network ACL
  • Operates at subnet level
  • Stateless - so return traffic must be explicitly allowed through an outbound rule.
  • Processes rules in number order, lowest first, and the first matching rule wins. So if traffic is allowed by a higher-numbered rule but denied by a lower-numbered rule, the traffic will be denied.
  • Applies at the network (subnet) level - so one deny will block that traffic for all instances in the subnet.
  • Deny all is the default final rule for a NACL.
  • Example: a DENY on port 80 makes no difference if all traffic is already allowed by a lower-numbered rule.

NAT instance - there are NAT-specific AMIs in the AWS Marketplace to create a NAT instance.
The NAT instance must be inside a public subnet and must have a public/Elastic IP associated with it. Ports (HTTP/HTTPS) must be open, and on the NAT instance the source/destination check must be disabled; this allows traffic from our private subnet to reach the internet via NAT.

Extending a VPC to on-premises must always be controlled and encrypted. VPN allows a subnet in one geographic location to extend to another geographic location, which means it can extend to the on-premises network too. So we can connect to all resources internally without the need for public addresses.
VPG (Virtual Private Gateway) - acts as the connector on the VPC side of the VPN connection.
The VPG is connected to the VPC, and the VPN is associated with the customer gateway, creating the endpoint. A Customer Gateway acts as the connector on the on-premises side of the VPN. This is where you configure the public IP address of the on-premises network.

Both VPG and the Customer Gateway are required to establish a VPN connection.

Only one VPG is attached to a VPC.

Alternate methods for deploying a VPN to AWS - configure an OpenVPN instance which lives in a public subnet, and connect to its public IP using the OpenVPN client.
But one must also consider high availability for this instance. What if this instance fails? A single point of failure? So high availability must be applied to this instance.

VPC peering

You cannot do VPC peering between two VPCs in different regions; both VPCs must reside in the same region. VPCs are not AZ-specific.

When peering you need to make sure that each VPC has a different (non-overlapping) CIDR range.
If the first VPC is peered to a second VPC, and the second is peered to a third, then the first and third VPCs cannot communicate with each other unless explicitly peered (peering is not transitive).

Peering connections are made VPC-to-VPC, but routes over the peering can be scoped down to a subnet or even a single instance.

Limitations: you cannot just extend VPC peering from the cloud network to an on-prem network; you still need a VPN and a Virtual Private Gateway in order to connect.
You also cannot access an S3 endpoint in a second VPC from the first VPC (in spite of having a peering connection).

Amazon RDS

RDS Essentials

RDS is a fully managed relational DB service in the cloud. It does not allow access to the underlying OS, but you can connect to the DB server itself as normal, e.g. via the command line. Ability to provision/resize hardware on demand. Multi-AZ deployments. Read replicas for efficient reads (MySQL/PostgreSQL/Aurora).

Supported DB engines :

  • MySQL
  • PostgreSQL
  • Oracle
  • Microsoft SQL Server
  • Aurora
RDS disk space: minimum 5 GB, maximum 3 TB.

General Purpose SSD gives burstable performance; Provisioned IOPS gives the specified performance.

Benefits of running our own RDS instance instead of a relational DB hosted on an EC2 instance:
  • Automatic minor updates
  • Automatic backups
  • No need to manage OS
  • Multi-AZ
  • Automatic recovery in case of failure
Automatic AZ failover - Multi-AZ synchronously replicates data to a backup instance located in another availability zone in the same region. In the event of a failure, AWS will automatically point the CNAME for that RDS instance to the standby/backup instance. Failover takes place in the following cases:
  • Availability zone outage
  • The primary DB instance fails
  • The instance's server type is changed
  • A manual failover is initiated
  • Software is being updated
  • (Also note: backups are taken against the standby instance to reduce I/O freezes and slowdowns when Multi-AZ is enabled.)
RDS backups: automated point-in-time backups are provided by AWS. E.g. MySQL requires the InnoDB engine for reliable backups.

Read Replicas: replicas of the original instance used for reads only (a minimal creation sketch follows this list).
  • MySQL/PostgreSQL/Aurora support
  • Uses the native replication built into MySQL/PostgreSQL
  • Can be created from other read replicas
  • Read replicas are only supported by the InnoDB MySQL storage engine.
  • Can offload tasks off of the production DB
  • Can promote a read replica to a primary instance
  • MySQL
    • Replicate for importing/exporting to RDS
    • Can replicate across regions
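A hedged boto3 sketch of creating one (identifiers and instance class are placeholders; the source instance must have automated backups enabled):

    import boto3

    rds = boto3.client("rds")

    rds.create_db_instance_read_replica(
        DBInstanceIdentifier="mydb-replica-1",
        SourceDBInstanceIdentifier="mydb-primary",
        DBInstanceClass="db.m3.medium",
        # For MySQL, pointing at a source ARN in another region
        # gives a cross-region replica.
    )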

When to use read replicas?

  • High non-cached DB read traffic
  • Running business functions such as data warehousing
  • Importing/exporting data into RDS
  • Rebuilding indexes
    • Can promote a read replica to primary
RDS integrates with CloudWatch, and many events can trigger notifications.

Compared to EC2: here AWS has access to the underlying OS, so it can expose metrics like memory and swap usage. On EC2, AWS doesn't have access to the OS, so we have to install custom scripts to monitor those other factors.


RDS Subnet Groups

For high availability, an RDS deployment must span different AZs; this is where subnet groups come in. When we create a Multi-AZ RDS instance, it automatically creates a standby in a different AZ (picked from the subnet group) as part of Multi-AZ failover.

RDS Security Groups

RDS instances can share the same SGs as EC2 instances, or use separate ones. The same port and source rules apply to RDS SGs.



Thursday, 24 November 2016

AWS EC2 stuff !

EC2 Basics

Instance types -
  1. T2 - Burstable performance instances
  2. M3 - Nice balance (general purpose)
  3. C4 - Compute optimized
  4. R3 - Memory optimized
  5. G2 - GPU optimized
  6. I2 - Storage optimized
  7. EBS-Optimized (an option on supported instance types)
EC2 instance size also decides the network throughput limits: a micro instance cannot have the same network throughput as a medium instance.

Instance storage -
  1. Instance-store volumes (ephemeral storage): data is erased if the instance is stopped, but retained if the instance is rebooted.
  2. EBS-backed volumes: network-attached storage. Data remains persistent for the life of the instance (and across stops). One instance can have multiple EBS volumes; one EBS volume can be attached to only one instance at a time.
EBS volumes have IOPS (input/output operations per second). One I/O operation counts in units of 256 KB or smaller, so a single 512 KB operation counts as 2 IOPS.

We can provision up to 20,000 IOPS on an EBS volume.

EBS types - 
  • General Purpose SSD: commonly used as the "root" volume.
    • Used on smaller instances.
    • 3 IOPS/GiB (burstable performance) - credits accrue when the IOPS are not being utilized.
    • Volume size of 1 GiB to 16 TiB
  • Provisioned IOPS: used for mission-critical applications.
    • Large DB workloads.
    • Volume size of 4 GiB to 16 TiB
    • Can provision up to 20,000 IOPS
  • Magnetic:
    • Low storage cost
    • Used where performance is not important / data is infrequently accessed
    • Volume size of 1 GiB to 1024 GiB
Pre-warming of volumes - sometimes we get previously used storage behind a new volume. AWS runs an erasing protocol when provisioning such volumes for us, which decreases initial IOPS performance. At such times we can run commands like Linux dd, which touches each block on the storage and pre-warms it, thereby giving us improved performance.

EBS snapshots - snapshots are incremental in nature: they store only the changes since the most recent snapshot, reducing costs so you pay only for the incremental storage. Even if the original snapshot is deleted, the newer snapshots still hold all the data.

Snapshots are stored in S3 buckets, but we cannot go and list them directly.

Snapshots taken of EBS volumes will degrade the volumes' performance while they run.

Remember: when an EC2 instance is stopped, we are not paying for that instance, only for its storage.

EC2 bootstrapping: writing bash scripts which are executed during instance provisioning.
This script is also called "user data" (processed by cloud-init) and can be read from within the instance at http://169.254.169.254/latest/user-data (instance metadata lives at http://169.254.169.254/latest/meta-data), as in the sketch below.
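A small stdlib-only sketch of reading both endpoints from inside an instance (they only resolve on EC2, and /user-data returns a 404 if none was set at launch):

    from urllib.request import urlopen

    META = "http://169.254.169.254/latest"

    instance_id = urlopen(META + "/meta-data/instance-id").read().decode()
    user_data = urlopen(META + "/user-data").read().decode()

    print("this instance:", instance_id)
    print("bootstrap script that ran at launch:")
    print(user_data)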

EC2-Classic instances are not part of VPCs. They are assigned a public IP address and a CNAME, and also receive a private IP address. The private IP is lost once the EC2-Classic instance is stopped.

Security Groups are used as firewalls for EC2 instances. An instance can belong to multiple security groups, and security groups can reference themselves as the "source" of traffic in firewall rules, as the sketch below shows.
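For example, a self-referencing rule via boto3 (the group ID and port are invented) lets every member of the group reach every other member on that port:

    import boto3

    ec2 = boto3.client("ec2")

    ec2.authorize_security_group_ingress(
        GroupId="sg-0123456789abcdef0",
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 5432,
            "ToPort": 5432,
            # Source is the same group: members can talk to members.
            "UserIdGroupPairs": [{"GroupId": "sg-0123456789abcdef0"}],
        }],
    )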

Action item - go and read the EC2 service limits in the AWS console! It's important!!!

We can have 40 total instances, of which 20 is the max running-instance limit.
EC2-VPC Elastic IPs have a limit of 5 IPs per VPC.
Rules per VPC SG - 50
VPCs per env - 5
Security Groups per VPC - 100

The longer the term we purchase a reserved instance for, the less we pay per hour.

While creating an EC2 instance, you can open the SSH port with an appropriate source in order to log into it :)

Note: private IP addresses persist when EC2 instances are shut down, but public IP addresses are not persistent; they are lost once an instance is shut down.
There are two types of subnets in a VPC - private and public.

An instance needs a public IP address and an internet gateway attached to its subnet in order to connect to the internet.

Elastic IP addresses are always convenient because if any instance becomes unhealthy or unresponsive or needs to be removed, we can always detach the Elastic IP address and attach it to any other instance. For example, you can have two NAT instances and one Elastic IP address which can fail over to the second NAT instance.

Security Groups

We can have up to 500 security groups per VPC.
Each security group can have up to 50 rules.
We can assign 5 security groups per EC2 network interface.
An instance can belong to multiple security groups.
Remember, security groups DENY everything by default; we cannot create explicit deny rules, only allow rules.
Remember, responses to inbound/outbound traffic are stateful, i.e. if inbound traffic on port 80 is allowed in, then the response traffic out over port 80 is allowed even if port 80 is not open for outbound.
One can change the SGs of an instance, but the same isn't true for EC2-Classic.

The "default" SG allows its members to talk freely to each other: there is an inbound rule which allows all traffic whose source is the "default" group itself, so instances in the default SG can communicate with each other.

Bastion host - a system identified as a critical strong point of the network which can be used as a gateway to connect to other instances. One can SSH into private resources without using a VPN. A bastion host may have additional security measures/software installed for further tightening.



Monitoring on EC2

Types of status checks:
  • System status checks - loss of network connectivity, loss of system power, physical host problems. Solution - stopping and starting the instance will start it on a different physical host.
  • Instance status checks - corrupted file system, failed system status checks, exhausted memory. Generally a reboot or rebuilding the AMI solves the issue.
CloudWatch alarms: by default CloudWatch automatically monitors the metrics that can be seen at the host level. If we wanna look at memory utilization we have to use a Perl script provided by Amazon.
  • Basic monitoring - data is available automatically in 5-minute periods at no charge.
  • Detailed monitoring - data is available in 1-minute periods.
OS-level metrics that require a third-party script to be installed:
  • Memory utilization, memory used, memory available.
  • Disk swap utilization.
  • Disk space utilization, disk space used, disk space available.

You can also monitor CPU credits for T2 micro instances, which accrue credits when the CPU is underutilized.

EC2 Placement Groups

Placement groups - clusters of instances within the same availability zone with a low-latency 10 Gbps connection between them. Used for workloads which require extremely low latency; the instances are placed physically close together.

If an instance in a placement group is stopped, AWS is gonna try to place it physically as close as possible to the placement group when it starts again. Sometimes instances added later will hit an insufficient-capacity error, which can be resolved by stopping and starting all the instances again.

Troubleshooting - instances not originally launched into a placement group cannot be moved into one. Placement groups cannot be merged together. Placement groups cannot span multiple AZs. Placement group names must be unique within your own AWS account. Placement groups can be connected. Instances must have 10-gigabit network speeds in order to take advantage of placement groups.


Serving traffic to private web servers

An ELB should have at least 2 subnets.
Connection draining in ELB - before the LB de-registers an unhealthy instance, it waits a configured amount of time for the current connections to drain.

Tuesday, 22 November 2016

AWS S3

S3 essentials

Static files. A set of key-value pairs. Buckets are groupings of information and have their own namespaces.
Bucket names must be unique across AWS.
S3 provides unlimited storage, 11 nines durability, and 99.99% availability (synced across multiple AZs).
Objects can be as small as 0 bytes and as large as 5 TB.

Bucket limitations - 100 buckets per account at a time. Bucket ownership cannot be transferred once created. 

S3 security - all buckets/objects are private by default. ACLs can share S3 buckets with different accounts. There are different bucket policies too (granting access to anonymous users, restricting access by IP address, restricting access by HTTP referrer). Public content can be downloaded via URL. Signed URLs use a key to create a unique, time-limited URL; this can be done with the API (a sketch follows).
S3 also has server-side encryption.
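A pre-signed URL sketch with boto3 (bucket and key are placeholders):

    import boto3

    s3 = boto3.client("s3")

    # Anyone holding this URL can GET the private object until the expiry
    # passes; after that the link simply stops working.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "private-assets", "Key": "invoices/0042.pdf"},
        ExpiresIn=900,   # seconds (15 minutes)
    )
    print(url)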

S3 can be used to host static websites (using Route 53). S3 can act as an origin for the CloudFront CDN. A multipart upload facility is available: required for objects of 5 GB or larger, but its suggested use is for 100 MB and larger.

An uploaded object is synced across all AZs within that region.

Consistency - if you upload a new object and instantly make a read request, the new object will be available (read-after-write consistency for new objects); overwrites and deletes are eventually consistent.

S3 uses - hosting static files, origin for the CloudFront CDN, hosting static websites, file share for networks, backing up/archiving (AWS Storage Gateway).

S3 event notifications - events can be things like RRSObjectLost (for recreating lost RRS objects).
Events can be sent to SNS, Lambda, or an SQS queue.

AWS IAM (Identity and Access Management)

IAM

Can federate with SAML providers such as Active Directory for temporary and single sign-on access.

MFA can be managed, and IAM provides pre-built policy templates for users and groups.

Users & Groups

Groups - Assign permission policies to more than one user at a time.

Users - best practice is to use IAM users. By default everything is implicitly denied, and an explicit deny always overrides an allow. Unless an allow policy is applied, a user will not have access to anything.

Roles

Role can be created for EC2 instances or other accounts to assume some permissions. Can be temporarily granted too.


Best Practices for New Accounts

All resources in Amazon have a specific resource name, called an ARN.
A user can belong to multiple groups, and multiple policies can be applied to a user. Now remember the explicit deny :)

API Keys and Roles

An EC2 instance can assume only 1 IAM role at a time, and the role can be applied only while creating the instance.

IAM Policies

We can create our own policies too, and policies have versions. One can also simulate policies (a create-policy sketch follows).
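A create-policy sketch in boto3; the policy name, bucket ARN and statements are made up, just to show the allow-plus-explicit-deny shape:

    import json
    import boto3

    iam = boto3.client("iam")

    policy_doc = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:ListBucket",
             "Resource": "arn:aws:s3:::my-app-bucket"},
            # The explicit Deny wins over any Allow, as noted above.
            {"Effect": "Deny", "Action": "s3:DeleteObject",
             "Resource": "arn:aws:s3:::my-app-bucket/*"},
        ],
    }

    resp = iam.create_policy(
        PolicyName="app-bucket-read-only",
        PolicyDocument=json.dumps(policy_doc),
    )
    print(resp["Policy"]["Arn"])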


IAM event logging with CloudTrail

The CloudTrail service stores its logs in an S3 bucket. We can also have an SNS notification for each log file delivery.
One can create an SNS topic, but after that one needs to create a subscription in order to send notifications - for example an email subscription for the topic.

Sunday, 20 November 2016

AWS CSA Prep 3

Storage services overview 

S3 (Simple Storage Service) - unlimited object storage (key-value). Can be used for static website hosting too (needs coupling with Route 53). Designed for 11 nines (99.999999999%) durability and 99.99% availability. Charges are based on per-GB storage as well as data sent out of the region; same-region data transfer does not cost anything, and the more storage used, the less it costs per GB.
S3 objects can be encrypted too, and HTTPS endpoint security can also be applied.

Remember Bucket names should be unique across all AWS. U-N-I-Q-U-E !

RRS (Reduced Redundancy Storage): has 99.99% durability vs "eleven nines" for the standard storage class. Can be used for data which can be reproduced/recreated easily.

Lifecycle policies & object versioning - we can keep unlimited versions if versioning is enabled on S3. We can archive/back up older versions to Amazon Glacier.

Now what is Amazon Glacier - it's archival storage, used for data that is not frequently accessed. Retrieval takes several hours. Around $0.01 per GB per month. A lifecycle sketch that archives to Glacier follows.
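A lifecycle sketch with boto3 that does exactly that archiving (the bucket name, prefix and day counts are placeholders):

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_lifecycle_configuration(
        Bucket="log-archive",
        LifecycleConfiguration={
            "Rules": [{
                "ID": "archive-then-expire",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                # Move to Glacier after 30 days, delete after a year.
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            }],
        },
    )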

Amazon Storage Gateway - this is the connection between a local data center and cloud services like Amazon S3. It has two types,


  • Gateway-Cached Volumes - creates a storage volume and mounts it as an iSCSI device on premises. It then stores the data in Amazon S3 but caches frequently accessed data on premises, for faster access times.
  • Gateway-Stored Volumes - stores all data locally and backs the data up as incremental backups to S3.


Amazon Import/Export - users can physically snail-mail large amounts of data to Amazon; Amazon will upload it to the cloud within one day of receiving the data.


Database services

Amazon RDS - fully managed DB service from Amazon. No access to the underlying OS is allowed; patching etc. is all managed by AWS.

DBs supported by RDS -
  1. MySQL
  2. PostgreSQL
  3. Oracle
  4. MS SQL server
  5. Aurora
Aurora - Amazon's MySQL-compatible engine, with up to 5 times the performance of MySQL at lower cost.

Amazon ElastiCache - in-memory cache for high-performance DB queries; caches the results of queries etc. The application needs to be built to work with either Redis or Memcached.

Amazon DynamoDB - a fully managed NoSQL service. Built to be fault tolerant: in the backend it syncs data across all AZs in the region. Easily integrates with Elastic MapReduce.

Amazon Redshift - used as a petabyte-scale data warehousing service.


Analytics

Amazon Elastic MapReduce - spins up EC2 instances with Hadoop clusters. We have access to the underlying OS.


App Services

Amazon SWF (Simple Workflow) - used for workflow executions, with a control panel to monitor task workflows. Scalable parallel processing on EC2. The service can be used with on-premises servers too. Guarantees execution of the workflow.

Amazon SQS (Simple Queue Service) - similar to Azure queues. Guarantees delivery of each message at least once, but does not guarantee the absence of duplicates.

Amazon SNS (Simple Notification Service) - coordinates delivery of messages to specific endpoints. Endpoints can be SQS, Email, SMS, HTTPS, Application etc.


Deployment Services

Amazon EB (Elastic Beanstalk) - deploys a complete app environment automatically. Supports Docker containers.

Amazon CloudFormation - allows us to code infra and deploy resources based off a pre-built template. Good for disaster recovery; we can even version-control our AWS infrastructure.

Management Services

IAM - Identity and Access Management - manage permissions to AWS resources. Resource-level/API-call permissions can be managed.

CloudTrail - logs all API calls made to AWS.

CloudWatch - monitors services like EC2. Provides centralized logging for performance metrics etc. Heavily used in Auto Scaling.

Directory services - can connect on-premise Microsoft Active Directory with the AD Connector. Also has the ability to set up and operate a new directory.

AWS CSA Prep 2

Compute & Networking :

EC2 – Elastic Compute Cloud provides VMs. Servers, you call 'em. You can always customize them: processors, memory, volumes, n/w throughput. The pricing models for EC2 are,
  • Reserved instances – this option is used when you know that you wanna run those web servers 24*7 and want those instances up and running at any given point in time.
  • On-Demand instances – spin up when needed and spin down when not needed.
  • Spot instances – non-production instances which are unreliable, as AWS can take 'em back whenever they feel like it. They have to be bid on.
Elastic Load Balancing – AWS' load balancer to distribute traffic across instances in different AZs. Used in auto scaling and fault tolerance.
Route 53 – domain management service. Directs traffic from a domain to EC2 instances.
AMI – Amazon Machine Images. If you don't know what machine images are, go home.
Instance store-backed instance (ephemeral storage) – storage provided by the EC2 instance itself. This is wiped when the instance is turned off.
EBS-backed instance – block storage attached to EC2 instances. Used to back up snapshots. Can provision IOPS/optimized EBS to help traffic between the instance and the EBS volume. Min size – 1 GiB, max size – 16 TiB. EBS volumes cannot be attached to instances in different AZs and can be attached to only one instance at a time. Point-in-time snapshots can be taken.
User responsibilities regarding EC2 instances – security groups, firewalls, EBS encryption (available only on M3 or larger; any instance smaller than M3 should not use EBS encryption as it's a resource hog), applying SSL certs to the ELB.
AWS responsibilities – DDoS protection, ingress network filtering. Port scanning is not allowed, even in our own environment.
VPC – isolation of resources. Internal communication allowed between resources, but for inter-VPC communication peering should be done. VPC is free of cost, yay!
There is something called EC2-Classic, which is a deprecated offering by AWS. Such instances don't belong to VPCs. It has been discontinued for new accounts since Dec 2013.
Route 53 – DNS hosting solution. Register/transfer domains. Can be used to route http://www.xyz.com to a CloudFront distribution/ELB/EC2/RDS etc. Route 53 is also used to manage internal DNS inside VPCs. Can be used for latency/geo/failover routing.

AWS Certified Solutions Architect Prep Part 1

Okay, so I will list down all the concepts that you will need to clear the AWS CSA examination. I may be brief, but I will just jot down the points one needs to study. Here we go.
Some basic concepts:
AWS has different AZs, known as availability zones, which are geographically separated. They are data centers which are separated geo-wise. AZs can also help store data in a specific region. Not all AWS services have to be AZ-specific - IAM, for example.
CloudFront is a CDN by AWS. It caters to fast static file delivery. These files are stored in hundreds of AWS edge locations and are served from whichever is closest to the user.
Scalability is the cloud's ability to expand/contract per demand. Why would you pay for 10 servers when you use 10 servers only at peak time and around 4 servers at non-peak time? That's where scalability comes into play. Elasticity is much the same: scale up and scale down per need, per workload. Note that S3 is highly fault tolerant out of the box, but EC2 is not. There are three types of scaling:
Proactive cyclic scaling: scaling up at peak times.
Proactive event scaling: scaling up for anticipated demand.
Auto scaling: based on metrics like CPU/network utilization.
Fault tolerance is the ability of our system to recover/continue operating in the event of failures. Fault tolerance can make use of auto scaling, Route 53, AZs, multiple regions. When designing fault-tolerant operations, your aim should be to design an app with failures in mind.
There are many AWS services, primarily grouped as:
Compute & Networking, Storage and Content Delivery, DB, Analytics, App services, Deployment services, Management services.

My Experiments With Cloud

Hey all, this would be my nth attempt at writing a blog. Most likely I drew inspiration from movies where stupid people like me write blogs to pacify the anxiety of life. Hell, this ain't gonna be a life-science blog, so without much ado lemme tell you what this blog will be about. It will be about my small experiments with recent technologies. Mind you, I am a software guy, and Cloud/DevOps has got my attention.
Even though I might sound like I am talking to a third person in my posts, the blog is only for my own personal convenience, to remember highlights of the work I do. There might be random posts which you may find lurking around; don't blame me, you are reading at your own risk ;)

So here we go.