Friday, February 10, 2017

MongoDB shell - query collections with special characters

From time to time I run into MongoDB collections whose names contain characters that the mongo shell interprets differently, so the name can't be used as-is.

For example, if your collection name is Items:SubItems and you try to query it as you normally would:


mongos> db.Items:SubItems.findOne()
2017-02-10T14:11:17.305+0000 E QUERY    SyntaxError: Unexpected token :

The 'fix' is to use a different JavaScript notation - so this will work:
mongos> db['Items:SubItems'].stats()
{
... 
}

This is called 'square bracket notation' in JavaScript.
See Property_accessors for more info.
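
The same trick works non-interactively with db.getCollection(); a minimal sketch (the database name mydb is just a placeholder):

$ mongo mydb --eval "printjson(db.getCollection('Items:SubItems').findOne())"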

Tuesday, December 6, 2016

Password recovery on Zabbix server UI

In case you need it ...

Obtain read/write access to the database (for MySQL this is what you need):

update zabbix.users set passwd=md5('mynewpassword') where alias='Admin';
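
For reference, the same statement can be run non-interactively from a shell; a sketch, assuming the default zabbix schema and the frontend account alias Admin:

$ mysql -u root -p -e "update zabbix.users set passwd=md5('mynewpassword') where alias='Admin';"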

Wednesday, November 16, 2016

Netcat HTTP server

Netcat is a very versatile program used for network communications, available in pretty much every Linux distribution's package repository.

Often I need to test different programs against a dummy HTTP server, and netcat makes this very easy.

Let's say you want to respond with HTTP code 200 ... this is what you run with netcat in a shell:


 nc -k  -lp 9000 -c 'echo "HTTP/1.1 200 OK\nContent-Length:0\nContent-Type: text/html; charset=utf-8"' -vvv -o session.txt

To explain the switches used:
  • -k accept multiple connections; netcat won't stop after the first connection (by default it would)
  • -l listen for TCP connections on all interfaces
  • -p the port number to bind to
  • -c 'echo "HTTP/1.1 200 OK\nContent-Length:0\nContent-Type: text/html; charset=utf-8"' is the most interesting one ... it responds back to the client with a minimal HTTP header and sets code 200 OK
  • -vvv verbosity level
  • -o session.txt netcat will write all the input and output into this file
Now you have a dummy HTTP server running on port 9000 that will answer 200 OK ALL the time :)
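
To check that it works, hit it from another terminal with curl (assuming curl is installed):

 curl -i http://localhost:9000/

You should get back the 200 OK status line and the headers that netcat echoes.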

Monday, March 28, 2016

Backups with Duplicity and Dropbox

Dropbox is a very popular file storage service; by default it synchronizes all your files
across your devices. This is important to know, since you will be backing up data into
Dropbox and you don't want to download those backups on every device you have connected.

What we want to do is to backup files, encrypt them and send them to Dropbox.
All this is achieved with Duplicity.

This is the setup

  • Linux OS, any distro should work I guess, but I tried it on Ubuntu 14.04 LTS
  • Dropbox account (going Pro or Business is recommended, since backups will typically grow past the 2GB of a basic account)

To encrypt files you will need GPG. In case you don't have a key on your system
we need to do a bit of work; if you do have a GPG key you can skip the next section.

GPG Setup

In this section we will create the GPG public/private key pair that will be used to encrypt the data you back up to Dropbox.


# install
$ sudo apt-get install gnupg
#
# check if you have any keys
#
$ gpg --list-keys
# if this is empty then you need to create a set of keys
# follow the wizard to create keys
#
$ gpg --gen-key
gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

gpg: keyring `/home/yourname/.gnupg/secring.gpg' created
Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 
Requested keysize is 2048 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 
Key does not expire at all
Is this correct? (y/N) y

You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) "

Real name: Your Name
Email address: yourname@gmail.com
Comment: 
You selected this USER-ID:
    "Your Name "

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
You need a Passphrase to protect your secret key.

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.


....+++++
..+++++
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
+++++

gpg: checking the trustdb
....

#
#
# At this point the keys are created and saved into your keyring
# list keys
#
#
$ gpg --list-keys
/home/yourname/.gnupg/pubring.gpg
--------------------------------
pub   2048R/999B4B79 2016-03-26
            ^^^^^^^^ <-- used by duplicity
uid                  Your Name <yourname@gmail.com>
sub   2048R/99917D12 2016-03-26 

# Note 999B4B79 which is your keyid

Duplicity install

$ sudo apt-get install duplicity

After installation, if you are on Ubuntu 14.04 LTS you will need to apply this patch
http://bazaar.launchpad.net/~ed.so/duplicity/fix.dpbx/revision/965#duplicity/backends/dpbxbackend.py
to /usr/lib/python2.7/dist-packages/duplicity/backends/dpbxbackend.py
If you don't know how to apply the patch, it is simpler to open the file around line 75 and add the line marked below

 72 def command(login_required=True):
 73     """a decorator for handling authentication and exceptions"""
 74     def decorate(f):
 75         def wrapper(self, *args):
 76             from dropbox import rest  ## line to add
 77             if login_required and not self.sess.is_linked():
 78               log.FatalError("dpbx Cannot login: check your credentials",log.ErrorCode.dpbx_nologin)

Dropbox and duplicity setup

You need to have an account first. Open your browser and login.

Backups with duplicity and dropbox

Since this is the first time you run it, you need to create an authorization token; this is done as follows


$ duplicity --encrypt-key 999B4B79 full SOURCE dpbx:///
------------------------------------------------------------------------
url: https://www.dropbox.com/1/oauth/authorize?oauth_token=TOKEN_HERE
Please authorize in the browser. After you're done, press enter.

Now authorize the application in your browser. This will create an access token in Dropbox.
You can see the apps you have linked by going to Security.
You should see backend for duplicity under Apps linked.
In case you need to know what token is in use, you can find it on your system in ~/.dropbox.token_store.txt


Local and Remote metadata are synchronized, no sync needed.
Last full backup date: none
GnuPG passphrase: 
Retype passphrase to confirm: 
--------------[ Backup Statistics ]--------------
StartTime 1459031263.59 (Sat Mar 26 18:27:43 2016)
EndTime 1459031263.73 (Sat Mar 26 18:27:43 2016)
ElapsedTime 0.14 (0.14 seconds)
SourceFiles 2
SourceFileSize 1732720 (1.65 MB)
NewFiles 2
NewFileSize 1732720 (1.65 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 2
RawDeltaSize 1728624 (1.65 MB)
TotalDestinationSizeChange 388658 (380 KB)
Errors 0
-------------------------------------------------

Backups

Once the first full backup has finished you can start making incremental backups, list the backups, etc.
# list the backup files
duplicity --encrypt-key 999B4B79 list-current-files dpbx:///
#

## Make an incremental backup

duplicity --encrypt-key 999B4B79 incr SOURCE dpbx:///
.....
.....
.....

duplicity --encrypt-key 999B4B79 list-current-files dpbx:///
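
To get data back out, duplicity's restore action works against the same URL; a sketch (the target paths and the file path inside the backup are placeholders):

# restore the latest full+incremental chain into a local directory
duplicity --encrypt-key 999B4B79 restore dpbx:/// /tmp/restore
# restore a single file from the backup
duplicity --encrypt-key 999B4B79 --file-to-restore some/path/file.txt restore dpbx:/// /tmp/file.txt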

Troubleshooting

During a backup if you see something like

Attempt 1 failed. NameError: global name 'rest' is not defined
Attempt 2 failed. NameError: global name 'rest' is not defined

See the note about Ubuntu 14.04 above - you need to patch the dpbxbackend.py file.

Notes

If you use multiple computers and don't want to download all the backups from Dropbox,
you need to enable selective sync and exclude the Apps/duplicity folder.
I haven't used duplicity for a long time and have heard mixed opinions: some say it is excellent and some
say it has design flaws (I didn't check) where a new full backup will be taken after a while even if
you only do incrementals. Remains to be seen.
I guess if this doesn't work well I would look into Borg Backup, which seems to be the best these days since it
has dedup built in and many other features. One thing it doesn't have, though, is as many backends as duplicity, which
can use pretty much every cloud storage solution around :).

Wednesday, January 13, 2016

Sublime Text X11 Forward - linux headless

One of the newer editors (compared with Vim or Emacs) is Sublime Text.
It has many useful features and is quite popular these days; combined with vintage_keys enabled (vim emulation) it is
quite interesting.

This post shows what I did to get Sublime Text 3 working on a remote headless Linux server; I used CentOS 7.1 installed with the Base group.

Since sublime text needs a display to run you will need to install a few packages.

sudo yum install gtk2
sudo yum install pango
sudo yum install gtk2-devel
sudo yum install dejavu-sans-fonts # or the font of your choice
sudo yum install xorg-x11-xauth

After all these packages are installed the ssh server (sshd for CentOS) needs to have the following settings.

# /etc/ssh/sshd_config

X11Forwarding yes
X11DisplayOffset 10
TCPKeepAlive yes
X11UseLocalhost yes
Restart sshd in case you changed your config file
sudo systemctl restart sshd

I used PuTTY on a Windows box, so I had to make a small hack.

cd  $HOME
touch .Xauthority  # empty file
Windows based
Configure PuTTY to enable X11 forwarding and connect to your server.
One more thing to mention: if you use Windows you will also need to install a program called Xming.
After you download it, run the installer and start the Xming server.
Linux
You will need to run an X server - it doesn't matter which one - and have X11 forwarded over your ssh connection.
# when connect add the -X
ssh -X my_host_with_sublime_installed
# Or you enabled X11Forward into your .ssh/config
# something like this will do
Host *
   ForwardX11 yes


In case Sublime Text is not installed, download it from their site (it is always nice to have a license too), extract
the files, and typically you will end up with a directory called sublime_text_3.
# check first that the display is forwarded
$ echo $DISPLAY
localhost:10.0
$ cd  sublime_text_3
$  ./sublime_text --wait
# 
At this point you should see a Sublime Text window pop up on your local screen (display).

Saturday, August 22, 2015

Vagrant with libvirt(KVM) Ubuntu14

Vagrant doesn't have an official provider for libvirt, but there is a plugin that allows it to run KVM machines via libvirt on Linux.

First you might ask: why not VirtualBox/VMware etc.? Simply because KVM is built in and is very lightweight (especially if you run it on your laptop). Also, if you have pre-made KVM virtual machines you can easily package them as Vagrant boxes.

This is what you need to get started on Ubuntu 14.

Obtain the package (could be a different version)

$ wget https://dl.bintray.com/mitchellh/vagrant/vagrant_1.7.4_x86_64.deb

Install package

$ sudo dpkg -i vagrant_1.7.4_x86_64.deb

Install kvm, virt-manager, libvirt and ruby-dev

$ sudo apt-get install ruby-dev
$ sudo apt-get install kvm virt-manager
$ sudo apt-get install libvirt-dev

Remove ruby-libvirt just in case, as we need a specific version

$ sudo apt-get remove ruby-libvirt
Install from gem
$ sudo gem install ruby-libvirt -v '0.5.2'

Install the plugin

$ sudo vagrant plugin install vagrant-libvirt
_Note_: Installed the plugin 'vagrant-libvirt (0.0.30)'!
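
With the plugin installed you can tell Vagrant explicitly which provider to use; a quick sanity check (assuming your Vagrantfile points at a libvirt-compatible box):

$ vagrant up --provider=libvirt
$ vagrant status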

Thursday, December 18, 2014

Supervisor (python supervisord) email alerts

The program supervisor, written in Python, is used to supervise long-running processes. If a long-running process stops (crashes), supervisor will detect it and restart it. You will get entries in the log files; however, unless you have a log aggregation tool, log in to the server, or have some other monitoring tool, you will not know that your process has crashed.

However there is hope :) - you can set up an event listener in supervisor which can email you when a process exits. To do so you will need to install the python package superlance. This is how the setup is done.

# install superlance
$ sudo pip install superlance  # if you don't have pip install try easy_install 

# configure supervisor to send events to crashmail

$ sudo vim /etc/supervisor/supervisord.conf  # change according to your setup

[eventlistener:crashmail]
command=crashmail -a -m root@localhost
events=PROCESS_STATE_EXITED

$ sudo service supervisor stop && sudo service supervisor start
# done :)

In the example above, if a process crashes (exits) an event is sent to crashmail, which in turn emails root@localhost - of course you can change the email address. crashmail actually uses sendmail to send the email (postfix and qmail come with a sendmail-like program, so no worries).
Also, the email alert will be sent for any program that crashes; if you want to filter, you can pick just the programs you want by specifying -p program_name instead of -a (see the sketch below). For more info see the Crashmail section of the superlance docs.
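
For example, a listener that only watches one program and mails a different address could look like this (the program name and address are placeholders, mirroring the config above):

[eventlistener:crashmail_myapp]
command=crashmail -p myapp -m ops@example.com
events=PROCESS_STATE_EXITED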

Friday, November 21, 2014

Gitlab(Rails) gem loader error

I was trying to make a simple bash pre-receive hook in Gitlab and got one of these:


# pre-receive hook
#!/bin/bash

`knife node show`

# Error
/usr/local/lib/ruby/gems/1.9.1/gems/bundler-1.3.5/lib/bundler/rubygems_integration.rb:214:in `block in replace_gem': chef is not part of the bundle. Add it to Gemfile. (Gem::LoadError)

Initially I thought I could change the hook to ruby and that would fix it, but after trying all 6 ways to
execute a command according to http://tech.natemurray.com/2007/03/ruby-shell-commands.html with no luck, I looked further into the gem specs for Rails and it looks like you can't load a gem that is not
declared in the Gemfile of your application.

So - what options do you really have? Install all the gems and their dependencies into the Rails application Gemfile just to execute a
command?! Well, there is a different way - sudo to the rescue :)


# pre-receive hook
#!/bin/bash

`sudo -u USER_THAT_RUNS_THE_APP knife node show`


# also you need to make sure into sudoers that the USER_THAT_RUNS_THE_APP has the right to execute without tty
Defaults:USER_THAT_RUNS_THE_APP !requiretty

Sunday, September 14, 2014

Vim - find occurrences in files.

Vim is the editor for anybody using the CLI on a daily basis. One useful feature it has is find/grep across files. Obviously you can exit or suspend vim and run find or grep, but not many know that vim has this built in. You can simply use vimgrep and the like - for more info see http://vim.wikia.com/wiki/Find_in_files_within_Vim.
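
As a quick sketch (the pattern and glob are just examples), the first command fills the quickfix list with every match, :copen shows the list and :cnext jumps to the next match:

:vimgrep /TODO/ **/*.rb
:copen
:cnext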

Tuesday, March 25, 2014

Vim setup for Chef(Opscode) Cookbooks

I've been seriously programming Chef cookbooks for a while but always felt something was missing ... Well, I didn't have

  • jump to definition for any Chef dsl
  • auto completion
  • syntax highlight

Recently I found a solution for this; this is my vim setup (just as easily you can do it in Sublime Text as well). These are the tools in my setup

  • vim
  • vim-chef
  • ripper-tags (to my surprise, ctags doesn't work well with ruby files ...)

To set it up is as simple as

# vim with pathogen
$ git clone https://github.com/vadv/vim-chef ~/.vim/bundle/vim-chef
$ sudo /opt/chef/embedded/bin/gem install gem-ripper-tags
$ knife cookbook create test_cookbook -o .
# create tags - there are better ways to do it - see gem-tags for example
$ ripper-tags -R /opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.10.4 -f tags
$ ctags -R -f tags_project
vim 
:set tags=tags,tags_project 
# done 

Sunday, March 2, 2014

Getting started with the new AWS tools

AWS replaced their Java based tools with a neat Python package for Linux (I didn't try the Windows based ones yet ...). Why are these tools nice?!

  • written in Python
  • support for all services from a single tool
  • wizard configuration

To get started

# use virtualenv or global
# this example shows virtualenv

$ mkdir AWS
$ virtualenv AWS 
...
$ source AWS/bin/activate

# install the tools from pypi

$ pip install awscli
...
# configure

$ aws configure
AWS Access Key ID [None]: XXXXXX
AWS Secret Access Key [None]: XXXXXX
Default region name [None]: us-west-1
Default output format [None]: json

$ aws ec2 describe-regions
{
    "Regions": [
        {
            "Endpoint": "ec2.eu-west-1.amazonaws.com", 
            "RegionName": "eu-west-1"
        }, 
        {
            "Endpoint": "ec2.sa-east-1.amazonaws.com", 
            "RegionName": "sa-east-1"
        }, 
        {
            "Endpoint": "ec2.us-east-1.amazonaws.com", 
            "RegionName": "us-east-1"
        }, 
        {
            "Endpoint": "ec2.ap-northeast-1.amazonaws.com", 
            "RegionName": "ap-northeast-1"
        }, 
        {
            "Endpoint": "ec2.us-west-2.amazonaws.com", 
            "RegionName": "us-west-2"
        }, 
        {
            "Endpoint": "ec2.us-west-1.amazonaws.com", 
            "RegionName": "us-west-1"
        }, 
        {
            "Endpoint": "ec2.ap-southeast-1.amazonaws.com", 
            "RegionName": "ap-southeast-1"
        }, 
        {
            "Endpoint": "ec2.ap-southeast-2.amazonaws.com", 
            "RegionName": "ap-southeast-2"
        }
    ]
}

# Done!

For more info: the project is hosted at github.com, the reference table is at AWS tools references, and the home page is at aws.amazon.com/cli.

Wednesday, November 20, 2013

Javascript testing with Real Browsers


karma run --runner-port 9100
PhantomJS 1.4 (Linux): Executed 1 of 1 SUCCESS (0.397 secs / 0.071 secs)
Chrome 30.0 (Linux): Executed 1 of 1 SUCCESS (0.518 secs / 0.06 secs)
TOTAL: 2 SUCCESS

Sunday, October 20, 2013

Chef server internal error (11.08)

Tried the new version of chef-server, 11.08, and it looks like it is broken. There is a bug in Jira, CHEF-4339. I tried it on CentOS, but it looks like Ubuntu is broken as well (see the bug description). How to see the error logs:
$ chef-server-ctl tail

==> /var/log/chef-server/nginx/access.log <==
192.168.122.1 - - [20/Oct/2013:14:56:42 +0000]  "PUT /sandboxes/000000000000a38d5dd8e2763f913c6c HTTP/1.1" 500 "8.109" 36 "-" "Chef Knife/11.6.0 (ruby-1.9.3-p429; ohai-6.18.0; x86_64-linux; +http://opscode.com)" "127.0.0.1:8000" "500" "8.049" "11.6.0" "algorithm=sha1;version=1.0;" "chef-user" "2013-10-20T14:54:00Z" "oMRtV6loUDnbKJuGcW6nqBbF8ww=" 1029

==> /var/log/chef-server/erchef/current <==
2013-10-20_14:56:42.62140 
2013-10-20_14:56:42.62144 =ERROR REPORT==== 20-Oct-2013::14:56:42 ===
2013-10-20_14:56:42.62145 webmachine error: path="/sandboxes/000000000000a38d5dd8e2763f913c6c"
2013-10-20_14:56:42.62145 {error,
2013-10-20_14:56:42.62146     {throw,
2013-10-20_14:56:42.62146         {checksum_check_error,26},
2013-10-20_14:56:42.62146         [{chef_wm_named_sandbox,validate_checksums_uploaded,2,
2013-10-20_14:56:42.62147              [{file,"src/chef_wm_named_sandbox.erl"},{line,144}]},
2013-10-20_14:56:42.62147          {chef_wm_named_sandbox,from_json,2,
2013-10-20_14:56:42.62148              [{file,"src/chef_wm_named_sandbox.erl"},{line,99}]},
2013-10-20_14:56:42.62148          {webmachine_resource,resource_call,3,
2013-10-20_14:56:42.62148              [{file,"src/webmachine_resource.erl"},{line,166}]},
2013-10-20_14:56:42.62149          {webmachine_resource,do,3,
2013-10-20_14:56:42.62149              [{file,"src/webmachine_resource.erl"},{line,125}]},
2013-10-20_14:56:42.62150          {webmachine_decision_core,resource_call,1,
2013-10-20_14:56:42.62150              [{file,"src/webmachine_decision_core.erl"},{line,48}]},
2013-10-20_14:56:42.62150          {webmachine_decision_core,accept_helper,0,
2013-10-20_14:56:42.62151              [{file,"src/webmachine_decision_core.erl"},{line,583}]},
2013-10-20_14:56:42.62151          {webmachine_decision_core,decision,1,
2013-10-20_14:56:42.62151              [{file,"src/webmachine_decision_core.erl"},{line,489}]},
2013-10-20_14:56:42.62152          {webmachine_decision_core,handle_request,2,
2013-10-20_14:56:42.62153              [{file,"src/webmachine_decision_core.erl"},{line,33}]}]}}

==> /var/log/chef-server/erchef/erchef.log.1 <==
2013-10-20T14:56:42Z erchef@127.0.0.1 ERR req_id=rOkhxZcSowyaKaD+WsjFKg==; status=500; method=PUT; path=/sandboxes/000000000000a38d5dd8e2763f913c6c; user=chef-user; msg=[]; req_time=8043; rdbms_time=5; rdbms_count=2; s3_time=8028; s3_count=1


However the integration tests all pass ...

$ chef-server-ctl test

...

Sandboxes API Endpoint
  Sandboxes Endpoint, POST
    when creating a new sandbox
      should respond with 201 Created
  Sandboxes Endpoint, PUT
    when committing a sandbox after uploading files
      should respond with 200 OK
Deleting client pedant_admin_client ...
Deleting client pedant_client ...
Pedant did not create the user admin, and will not delete it
Deleting user pedant_non_admin_user ...
Deleting user knifey ...

Finished in 54.02 seconds
70 examples, 0 failures

Hopefully will be fixed soon.

Monday, October 14, 2013

Vim paste tricks

Vim is cool! But sometimes it can be annoying - for example you are editing your file and have a short snippet of code you want to insert into it, so copy and paste BUT ... vim has indenting on. Now there are different types of indent and you can try to turn them off - see Indenting source code. A neat trick is the :set paste option; however, there are cases where you want to turn indenting off for different reasons. You can use something like this in your .vimrc
function! IndentPasteOff()
  set noai nocin nosi inde=
endfunction

function! IndentPasteOn()
  set ai cin si
endfunction

nmap _0  :call IndentPasteOff()<CR>
nmap _1  :call IndentPasteOn()<CR>
" paste on/off (mapped to any free key, e.g. <F2>)
set pastetoggle=<F2>
Now when you don't want any indenting type _0, and to indent again type _1. Happy vimming!

Wednesday, August 28, 2013

Why schematics is awesome

Schematics is a Python library whose primary use is validating JSON data.

Why is this awesome versus other validation tools like validictory or jsonschema ?

The workflow by design is based on django/sqlalchemy models, so you get back an object that has fields and each field can have its own type, even complex types like objects that contain their own fields, and so on.

One more thing that schematics has is default values. This comes in very, very handy if you want your data to be normalized.

This is a simple example on how it works:

>>> from schematics import models
>>> from schematics import types
>>>
>>> class Client(models.Model):
>>>     name = types.StringType(required=True, min_length=1, max_length=255)
>>>     email = types.EmailType(required=True)
>>>     active = types.IntType(default=1)
>>>
>>> c = Client(raw_data={'name': 'John', 'email': 'john@example.com'})
>>> c.validate()
>>> c.serialize()
>>> {'active': 1, 'email': u'john@example.com', 'name': u'John'}

There are many other options to validate data - see Schematics.

Wednesday, February 27, 2013

A new era - Azure Cloud

It's official - I started my first Windows Azure instance


$ ssh azureuser@kickrobot.cloudapp.net
The authenticity of host 'kickrobot.cloudapp.net (168.61.33.28)' can't be established.
RSA key fingerprint is 0a:aa:74:ec:6a:0d:13:de:1c:c7:e2:8c:e5:74:0b:cf.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'kickrobot.cloudapp.net,168.61.33.28' (RSA) to the list of known hosts.
azureuser@kickrobot.cloudapp.net's password: 

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

Welcome to Ubuntu 12.10 (GNU/Linux 3.5.0-21-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

  System information as of Thu Feb 28 01:04:05 UTC 2013

  System load:  0.04              Processes:           92
  Usage of /:   3.0% of 29.52GB   Users logged in:     0
  Memory usage: 16%               IP address for eth0: 10.74.234.17
  Swap usage:   0%

  Graph this data and manage this system at https://landscape.canonical.com/

45 packages can be updated.
26 updates are security updates.

Get cloud support with Ubuntu Advantage Cloud Guest
  http://www.ubuntu.com/business/services/cloud

Monday, February 18, 2013

Ansible within AWS (ec2)

Ansible is a new configuration/orchestration management framework and is just awesome!

Why is that ?

  • very short learning curve
  • no need for a specific data service language
  • can be used to both execute/configure machines
  • very simple to write your own modules
  • can be used into a push or pull model
  • ... ansible.cc ... for more info

This is how you can use it within AWS (EC2) to manage services.

# Install ansible via git
$ cd /tmp
$ git clone https://github.com/ansible/ansible.git
$ cd ansible
$ python setup.py install
$ pip install boto # used for the ec2 inventory

# setup aws variables
$ export ANSIBLE_HOSTS=/tmp/ansible/plugins/inventory/ec2.py # ec2 inventory
$ export AWS_ACCESS_KEY_ID='YOUR_AWS_API_KEY'
$ export AWS_SECRET_ACCESS_KEY='YOUR_AWS_API_SECRET_KEY'

# setup ssh access
$ ssh-agent 
SSH_AUTH_SOCK=/tmp/ssh-dFUXvhH31724/agent.31724; export SSH_AUTH_SOCK;
SSH_AGENT_PID=31725; export SSH_AGENT_PID;
echo Agent pid 31725;
$ ssh-add /PATH_TO/YOUR_SSH_KEY_OR_PEM

# I use ec2-user on an Amazon Linux instance
ansible -m ping all -u ec2-user
ec2-54-242-33-49.compute-1.amazonaws.com | success >> {
    "changed": false, 
    "ping": "pong"
}

The ec2.py inventory connected to the AWS API and obtained all the instances running in the account matching the exported AWS SECRET/KEY credentials. Then ansible used the ping module (-m ping) to ping the host(s). The ping module just connects via ssh to a host and reports pong with changed: false.

Now that we can connect, let's see if we can leverage some of the metadata offered by AWS. My server runs in the security group ssh-web, and to access this information from within ansible all you have to do is use security_group_ssh-web. Where this comes from is the ec2.py inventory script; if you run the script directly you will see something like this.

$ /tmp/ansible/plugins/inventory/ec2.py

{
  "i-e4c9ca9c": [
    "ec2-54-242-33-49.compute-1.amazonaws.com"
  ], 
  "key_mykey": [
    "ec2-54-242-33-49.compute-1.amazonaws.com"
  ], 
  "security_group_ssh-web": [
    "ec2-54-242-33-49.compute-1.amazonaws.com"
  ], 
  "tag_Name_srv01": [
    "ec2-54-242-33-49.compute-1.amazonaws.com"
  ], 
  "type_t1_micro": [
    "ec2-54-242-33-49.compute-1.amazonaws.com"
  ], 
  "us-east-1": [
    "ec2-54-242-33-49.compute-1.amazonaws.com"
  ], 
  "us-east-1b": [
    "ec2-54-242-33-49.compute-1.amazonaws.com"
  ]
}

Starting the Apache web server on all instances belonging to the ssh-web group is as simple as:

ansible -m service -a "name=httpd state=started"  security_group_ssh-web  -u ec2-user -s
ec2-54-242-33-49.compute-1.amazonaws.com | success >> {
    "changed": true, 
    "name": "httpd", 
    "state": "started"
}

# notice -s which stands for use sudo without password 
From here on the sky is the limit; you can take a look at the docs site http://ansible.cc/docs/ for more complex examples.

Tuesday, January 29, 2013

MongoDB EC2 Deployment

MongoDB is part of the NoSQL ecosystem and is presented as a scalable, high-performance, auto-sharding database. A full list of the features that MongoDB offers can be found at mongodb.org.

In this post I'll explain how MongoDB can be deployed into the cloud - EC2 specifically - in order to support a three-layer architecture for a web application. Before I go into technical details and show how it can be done, we have to understand why things are done that way, since these days there are many ways to achieve the same thing. We will wear different hats, starting with the architect's. First let's establish who is involved in the whole process of building a web application. Obviously there is somebody who will pay for the application - we'll call this entity the business. The next entity is the developers, who will actually write the code, and the last (but not least) are the infrastructure people, the admins (the architect belongs to this group). All three entities can be combined into a single physical person or spread across different departments in an enterprise, but the roles remain the same.

Now that we know who is involved, let's see what the business wants. That's quite simple, and usually spans from a single line such as 'I want an application that is resilient to failure and always responsive' to a full business case with all the fine-grained details. The more details the better, but they don't necessarily answer the question I have in mind. My real question is how popular the app is going to be; this gives an estimate of what sort of traffic you will get, and based on the traffic volume (and hopefully pattern) you can determine quite a lot - from sizing the environment to how much it will cost to operate (remember, in EC2 there is only OPEX cost). In my experience the business will not have any more input into this equation, and that is just fine. So the next step is to look at how the developers think about it: will it be 'real time', how many reads and writes will be done from the web servers to the database, how will all the pieces fall into place, etc.

We will start building on the following premises about the application:

  • has to be resilient to failures (business's requirement)
  • has to be responsive all the time (business's requirement)
  • will need to support an initial high volume of users with the option to grow (business's requirement)
  • database traffic balanced at roughly 70% reads and 30% writes (the developers' prediction)
  • to be cost effective (this is always relative to the business's budget)
  • because of the above (cost) constraint, the business agrees that if a catastrophic failure happens in a region it is OK to have downtime, but not OK to be unable to bring the site up somewhere else within a few hours.

At this point we have enough details to start putting all the pieces together. I will not cover the load balancer (piece no. 1) or the application servers (piece no. 2) of the three-layer web application. The focus is going to be the database (piece no. 3) - MongoDB.

Starting at the bottom, the smallest part is an individual server (instance), and I'll explain how this can be made resilient to failure. In EC2 the instances sit in a flat network (not talking about VPC) and the storage is divided into two types:

  • ephemeral or instance storage - this is the disk space you have on the local hypervisor that hosts your instance; it is destroyed as soon as you terminate the instance
  • network block storage - EBS volumes - which survive termination of your instance

Obviously you will not store the database's data on the ephemeral storage, so the only real option is EBS (network storage). EBS comes in two flavors these days: provisioned IOPS and standard. The difference between the two is that the first has guaranteed performance, while the second is best effort with a (quite low) minimum. A consequence of this is that provisioned IOPS is quite expensive compared with standard EBS. Since we do have a constraint on cost we will have to use standard EBS, but there is hope - you can group a number of volumes and use Linux tools such as LVM or mdadm RAID to stripe or mirror them. So we will attach 10 EBS volumes to each server that acts as a MongoDB database (you can use fewer of course, but I'll do a RAID 10).

    
        # you will need to have your ebs volumes attached to the server
        mdadm --create --verbose /dev/md0 --level=10 --raid-devices=8 /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq
        # now create a file system
        mkfs.xfs /dev/md0
        #mount the drive
        mount /dev/md0 /mnt/mongo/data
        # I said 10 and there are 8 ...
        # the remaining 2 volumes are used for Journaling - therefore if Journaling is enabled it will not affect the data (see the sketch below)
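
For the remaining 2 volumes, one possible layout (device names and mount points are assumptions) is to mirror them and point MongoDB's journal directory at the mirrored volume via a symlink:

        # mirror the last 2 EBS volumes for the journal (RAID 1)
        mdadm --create --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sdr /dev/sds
        mkfs.xfs /dev/md1
        mkdir -p /mnt/mongo/journal
        mount /dev/md1 /mnt/mongo/journal
        # mongod keeps its journal in <dbpath>/journal, so link it to the mirrored volume before starting mongod
        ln -s /mnt/mongo/journal /mnt/mongo/data/journal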
    

With this done, let's look at how MongoDB offers resilience to failure. The solutions offered are Replica Sets and Sharding - that is, partitioning at the database layer.

MongoDB Replica Sets - what are they and how do they work? The idea is to have a set of databases grouped together as a replica set, which replicate the data between them asynchronously. Within the replica set there is a PRIMARY and a number of SECONDARY databases. Which server is the PRIMARY is established by a process called voting. How a database server becomes PRIMARY is based on different criteria; the important thing is that it is the only database server in the replica set that accepts writes. All other servers in the replica set only accept reads. Another very important factor is that a replica set is considered healthy only if it has 51% of its capacity available. That is, out of 3 servers you can lose only 1 - you just lost 33.3% and still have 66.6% up. If you have 4 servers and you lose 2, guess what ... you just lost 50% of your capacity and the replica set is not healthy - remember, you need 51% available. Knowing this, it is obvious that an odd number of members in a replica set is the preferred choice. There are workarounds for this: you can have servers not participating in voting, have arbiters (special servers that only participate in voting), and you can also give more weight to a server in the voting process.
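
A minimal sketch of initiating a three-member replica set from the mongo shell (hostnames are placeholders; each mongod must be started with the same --replSet name, here rs0):

$ mongo --host mongo1 --eval 'rs.initiate({
      _id: "rs0",
      members: [ { _id: 0, host: "mongo1:27017" },
                 { _id: 1, host: "mongo2:27017" },
                 { _id: 2, host: "mongo3:27017" } ]
  })'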

Why does all this happen, you may wonder - well, there is the CAP theorem and MongoDB chooses CA from it; this is how 10gen, the makers of MongoDB, decided to design the product and we have to live with it. You can read why and how at Consistency and Availability at MongoDB.

Now let's pause for a minute and see how the overall database cluster looks with a Replica Set:

[diagram: MongoDB Replica Set within a single region]

However, having all instances in one single region doesn't look too good in the case of a total failure of that region; it is always better to have an instance outside it - in this case I chose US-WEST (California).

[diagram: Replica Set with one member in US-WEST]

This is how we are doing so far with respect to the initial requirements:

  • has to be resilient to failures - the Replica Set will provide that
  • if a catastrophic failure happens in a region it is OK to have downtime, but not OK to be unable to bring the site up somewhere else within a few hours - having a member of the Replica Set in US-WEST fulfills this.

Well, how about the rest of the requirements?!

From the infrastructure point of view there is only one more requirement you can satisfy - room to grow. Initially this can be solved by vertically scaling the instance sizes, but you can only upscale to the biggest instance ... and then what? What else you can do is add what MongoDB calls shards - basically you have to partition the database. This may sound very scary, but MongoDB makes it quite easy: it will automatically split the data based on a key (or keys). You can provide the key yourself or let MongoDB use the _id - a special object called an ObjectId that is generated for every document stored.
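
Enabling sharding is done through mongos; a sketch (the mongos host, database/collection names and the shard key are just examples):

$ mongo --host mongos-host --eval '
      sh.enableSharding("mydb");
      sh.shardCollection("mydb.users", { _id: 1 });
  '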

Alright! We are doing quite OK so far - from the infrastructure point of view we have solved all the problems the business asked about, but it is not over yet - what about backups? You can't run a site without them! First let's pause and think about what the application will be doing with the actual setup. Based on the capabilities that MongoDB offers, it will write to the PRIMARY and read from the SECONDARYs ... well, that means that if the application servers are hosted in region US-EAST they will have to go over the wire to read from the servers hosted in US-WEST?! Not a very good idea ... but there is hope: MongoDB has an option when you create a Replica Set that marks a specific member of the replica set as hidden, meaning it will replicate data but will not make itself available for any reads or writes.

With this in place we can actually have servers in a remote region, US-WEST, that just replicate the data but are never actually used by the application servers. This is the best candidate for backups out of all the members!
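
Marking the US-WEST member as hidden (and giving it priority 0 so it can never be elected PRIMARY) is a small reconfiguration; a sketch, assuming it is member index 2 in the replica set config and that mongo1 is the current PRIMARY:

$ mongo --host mongo1 --eval '
      cfg = rs.conf();
      cfg.members[2].hidden = true;
      cfg.members[2].priority = 0;
      rs.reconfig(cfg);
  '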

The final infrastructure diagram, including two Replica Sets, two Shards and the backups, looks like this:

[diagram: two sharded Replica Sets spanning US-EAST and US-WEST, with hidden backup members]

Now let's switch roles and put the developer hat on. You've been told about all this setup and you may think all you have to do is drop the code on the application servers and you are done - but wait a minute, what happens if a full replica set becomes unavailable? Well, this is the part called designing for failure.

Let's assume the application has a few entities, one of them being the User. In a typical SQL world you would have a table with something like user_id, first_name and so on. Then all the other tables would be linked to this table with a foreign key on user_id. If we were to replicate this scenario, the schema would look like this:

    

    /* users collection */
    > db.users.find().pretty()
    {
            "_id" : ObjectId("5107fb6736141503d37b6a31"),
            "username" : "johnd",
            "password" : "bc7a0154948baa69ecbe1d7843b25113fc5f3f20",
            "first_name" : "fname"
    }
    >
    /* objects collection - linking back to users via user_id */
    > db.objects.find().pretty()
    {
            "_id" : ObjectId("5107fc7536141503d37b6a32"),
            "user_id" : ObjectId("5107fb6736141503d37b6a31"),
            "data" : "all the goodies you need"
    }
    >

    

What can go wrong with this schema?! Well, let's say we shard on the users._id key; MongoDB will then split the data into chunks as needed and distribute it accordingly. On the objects collection we have different options to shard: we can use objects._id or objects.user_id, etc. However, if user X is located on shard01 and most of his entries (if not all) are located on shard02, then if shard02 is down the user will be without entries! If shard01 is down the user can't even use the system. So what would be a better approach? Locality of the data: have the users collection embed the objects collection. This will look as:



    /* users collection embeds the objects collection */
    > db.users.find( {"username": "bobc"}).pretty()
    {
            "_id" : ObjectId("5107fe2936141503d37b6a33"),
            "username" : "bobc",
            "password" : "ad7a0154948baa69ecbe1d7843b25113fc5f3f20",
            "first_name" : "fname",
            "data" : {
                    "key" : "value"
            }
    }


With this schema, if the shard holding a user is down then that user can NOT use the system, but whenever a user can use the system his data will be complete and consistent.

The process of choosing the 'right' shard key is very tricky; it has a few constraints on the MongoDB side, and it also depends on your data structure and requirements. For more info see Shard keys for MongoDB.

Final review of the total requirements:

  • has to be resilient to failures
  • MongoDB's ability to function as a Replica Set provides this.
  • has to be responsive all the time
  • MongoDB Replica Set: write to the Primary and read from the Secondaries. In case you need more capacity there are two options, upscaling the instance size and sharding.
  • will need to support an initial high volume of users with the option to grow
  • Again, Sharding and Replica Sets will fulfill this.
  • balanced as 70% reads and 30% writes for the database traffic
  • For writes you have the option to add more shards; for reads you can add more servers to the existing Replica Sets - I chose three servers, but nobody stops you from adding five, for example.
  • to be cost effective (this is always relative to the business's budget)
  • Considering all the other requirements, three members per Replica Set is the minimum to have.
  • because of the above (cost) constraint the business agrees that if a catastrophic failure happens in a region it is OK to have downtime, but not OK to be unable to bring the site up somewhere else within a few hours.
  • In the case of a total failure of the primary site in region US-EAST you still have the data in US-WEST. If your infrastructure is automated it will take no time to re-create all three architecture layers.
  • Backups of the data
  • The nodes in US-WEST are the perfect candidates for this. Use EBS snapshots and ship the data to S3. With these snapshots you can recover even if all nodes (including US-WEST) are down.

Thursday, November 22, 2012

Puppet run_list (like Chef)

Chef has a very nice concept where whatever you need to run is part of a run_list. This provides an enforced order of execution, as opposed to leaving control of the execution to the configuration management framework. Puppet lacks such a run list - you can use operators like -> but there is no global list. This trick can simulate a run_list with the same results. Create a file called test.pp with the content:
class one() {
    notice('class one')
}

class two() {
    notice('class two')
}

class three() {
    notice('class three')
}

define include_list  {
    $cls = $name
    notice("including $name")
    include $name
}

if $fqdn == 'debian.localdomain'{
    $clss = ['one', 'two', 'three']
    include_list { $clss:; }
}
And then run it as
 puppet apply test.pp
 notice: Scope(Include_list[one]): including one
notice: Scope(Class[One]): class one
notice: Scope(Include_list[two]): including two
notice: Scope(Class[Two]): class two
notice: Scope(Include_list[three]): including three
notice: Scope(Class[Three]): class three
notice: Finished catalog run in 0.04 seconds

Tuesday, October 30, 2012

Ec2 (aws) - delete snapshots

EC2 snapshots are a way to make backups of your data in the Amazon cloud. To take snapshots you will need the ec2-api-tools and your access key and secret, or the X.509 certificates for your AWS account. Obviously, after you snapshot you will eventually need to delete snapshots that you don't need anymore. This example shows how to use the ec2-api-tools in a shell to delete snapshots that are not part of the current month. You can have a cronjob that runs every last day of the month, which gives you almost 30 days of snapshots.
# describe snapshots and sort by date
ec2-describe-snapshots -C cert.pem  -K key.pem | sort -k 5

# delete all but the current month (not the last 30 days) - note %m is the month
ec2-describe-snapshots -C cert.pem  -K key.pem | grep -v $(date +%Y-%m-) |  awk '{print $2}' | xargs -n 1 -t ec2-delete-snapshot -K key.pem -C cert.pem