Random thoughts of a warped mind…

April 26, 2012

Chef recipe to set up a new node’s FQDN, hostname etc. properly

Filed under: All,Amazon EC2,Chef,EC2,Ruby — Srinivas @ 15:19

How does an instance associate its hostname with a specific FQDN? And how do you automate this so that a Chef-managed instance sets its FQDN properly?

If you’ve ever set up instances at Amazon EC2 (not VPC!), you would know that:

  1. Every instance has two IP addresses: an internal RFC1918 address that is reachable only within AWS, and an external IP address via which the instance can be accessed from the internet.
  2. Both IP addresses are valid only while the instance is running, i.e. if the instance is stopped and started, both the old internal and external IP addresses are gone and new ones are allotted to it.
  3. To ensure that the external address remains the same, you would normally request an Elastic IP address and associate it with the instance. When an instance is stopped, this mapping is lost and the Elastic IP must be re-associated when the instance is started.
  4. The internal IP address of the instance is never to be taken for granted (VPC is an exception).
  5. See Instance addressing for details.

Now coming back to Ubuntu, this is how an instance associates its hostname with a specific FQDN:

  • File /etc/hostname contains the hostname of the system, which is typically like “somehost.mydomain.com”. While you could always run “hostname somehost.mydomain.com” to set the hostname, this does not persist across restarts, whereas updating /etc/hostname does.
  • File /etc/hosts contains IP to hostname/short name entries. Based on the standard configuration in /etc/nsswitch.conf, this file is almost always read before DNS lookups (or NIS hostname lookups).
  • Based on the hostname found from the gethostname() system call, the system looks up the IP address matching this hostname.
  • What gets consulted to map the hostname to an IP? It depends. If you have “hosts: files dns” in /etc/nsswitch.conf, then /etc/hosts is consulted first. If /etc/hosts doesn’t have an entry, then DNS is looked up, and a reverse lookup on an EC2 internal address returns its EC2 hostname and not your system’s name, e.g. hostname --fqdn would return something like ip-10-245-81-136.ec2.internal.
  • If the hostname of the system (as located by the gethostname() call) is in the hosts file (e.g. myname.mydomain.com) then myname.mydomain.com would be returned.
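The “files” step of that lookup can be illustrated with a small Ruby sketch. This is a simplified model and not what the resolver library actually runs: the function name canonical_name_from_hosts and the sample address 10.0.0.5 are made up for illustration, and matching is case-sensitive here for brevity. The idea is that the first name on the matching /etc/hosts line is what comes back as the FQDN.

```ruby
# Simplified model of the "files" step of hostname resolution.
# hosts_text is the content of an /etc/hosts style file.
def canonical_name_from_hosts(hosts_text, hostname)
  hosts_text.each_line do |line|
    entry = line.sub(/#.*/, "").split   # drop comments, split on whitespace
    next if entry.size < 2              # need an IP plus at least one name
    _ip, *names = entry
    # The first name on a matching line is treated as the canonical name.
    return names.first if names.include?(hostname)
  end
  nil # not in the hosts file; the real resolver would try DNS next
end

# Hypothetical hosts file content, for illustration only.
SAMPLE_HOSTS = <<~HOSTS
  127.0.0.1      localhost
  10.0.0.5       myname.mydomain.com myname
HOSTS

p canonical_name_from_hosts(SAMPLE_HOSTS, "myname")     # => "myname.mydomain.com"
p canonical_name_from_hosts(SAMPLE_HOSTS, "otherhost")  # => nil
```

This is also why the order of names on the hosts line matters: if the short name came first, the short name would be reported instead of the full name.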

The FQDN cannot be “set” as per the hostname man pages…

FQDN section of "man hostname"

This also affects the FQDN displayed in the Chef node status panel (of the admin GUI) and in the output of the “knife status” command. Considering that instance IP addresses are bound to change, and the FQDNs with them, I like to ensure a little consistency. The simplest way to do this on a standalone host is to add an entry for “myname.mydomain.com myname” against the host’s current IP address in /etc/hosts (way too many people add the real hostname to the localhost line; I don’t like that!). If you stop and start instances several times (I do, since I don’t want to pay AWS big bucks :-) ), then doing this each time is a hassle, hence this Chef recipe. If you don’t use Chef to provision your nodes, you could just drop in the update_hosts.sh script alone as /etc/network/if-up.d/update_hosts (with permissions 555), make sure you set your hostname properly (just once, at install time), and your FQDN will always be proper.
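As a rough illustration of the kind of rewrite such an if-up hook performs, here is a minimal Ruby sketch. It is not the update_hosts.sh script itself: the function name update_hosts_entry and the sample addresses are hypothetical, and it assumes the hostname has its own line in the hosts file rather than sharing the localhost line.

```ruby
# Return hosts_text with exactly one line mapping ip to fqdn and its short name.
# Any older line whose first name is fqdn is dropped; other lines are kept as-is.
def update_hosts_entry(hosts_text, fqdn, ip)
  short = fqdn.split(".").first
  kept = hosts_text.lines.map(&:chomp).reject do |line|
    names = line.sub(/#.*/, "").split.drop(1)  # names after the IP column
    names.first == fqdn                        # stale entry for this host
  end
  (kept + ["#{ip} #{fqdn} #{short}"]).join("\n") + "\n"
end

# Hypothetical content before the instance was restarted with a new address.
before = "127.0.0.1 localhost\n10.0.0.5 myname.mydomain.com myname\n"
puts update_hosts_entry(before, "myname.mydomain.com", "10.0.0.9")
```

A real hook would additionally have to find the interface’s current address and write the result back to /etc/hosts as root. Because the function replaces the old entry instead of appending blindly, running it on every ifup event does not accumulate duplicate lines.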

If you want Chef to set this stuff up for you, then here is the ruby recipe (most of it):

Drop the update_hosts.sh file into your cookbook’s files/default directory and then invoke the Ruby code as part of your recipe.

Now when chef-client runs as part of instantiating a new node, its hostname will be set to the FQDN you provided via knife’s “-N” option, and the update_hosts script will be set up to update /etc/hosts so that the entry for your hostname always points to its current IP address. This is tested on Ubuntu and Debian systems; changing it to work with Red Hat/Fedora based distros should be trivial, since the idea is the same.

  • http://www.facebook.com/profile.php?id=724773025 Alan Hannan

    This is a helpful suggestion and something I have used many times. One challenge I see is that using ‘knife ssh’ wants to send you to the chef-listed IP address of the server. If you set the IP address in /etc/hosts to the local IP Address (RFC1918) then ‘knife ssh’ fails. Can’t have it all I guess? What do you think about setting the ip address to the public IP?

  • srinivasmohan

    @Alan, All my chef managed instances are “stub” nodes in the sense that
    they do not manage other instances. All of my admin work happens from my
    laptop or a management instance at EC2. All of my knife-ssh commands
    run on a role/group of servers and I have every instance configured with
    a system generated hostname which also becomes the Chef FQDN for that
    managed node. There is a corresponding DNS entry added into my public
    dns via another chef recipe for this. My ifup event also invokes a dns
    update for that hostname. e.g. nodexxx.onepwr.org would become a CNAME
    for ec2-184-71-190-168.compute-1.amazonaws.com. And the ec2-XXX hostname
    resolves to the public facing internet address from outside and the
    internal EC2 private IP address from inside EC2.

    Knife ssh typically returns the chef fqdn of the node so even if you
    didn’t have dns entries per instance, you could always put together a
    quick script to update your local /etc/hosts on your mgmt instance/work
    machine to update hosts file to have the public or private address of
    your instance (as required). See my gists https://gist.github.com/4322269 and https://gist.github.com/4322280 for more info on what I mean…

  • Ankit

    Where are you defining the resource “Configure Hostname”?

  • Srinivas

    Ankit, just define “Configure Hostname” to be a bash block doing “hostname --file /etc/hostname”. That was trivial enough not to include IMHO…

  • Sal

    Srinivas, can you provide an example of where that bash block would go or where to define the Configure Hostname resource?

  • Msciwoj

    This only solves the problem “within” EC2 server in question (EC2A)
    If you have other EC2 server (EC2B) that needs to connect to EC2A, it still can’t use “myname.mydomain.com myname” for internal (inside Amazon network) communication (assuming same region), right?

  • http://www.onepwr.org/about Srinivasan Mohan

    Correct. Every EC2 host gets a public hostname which maps to its “current” public IP (whether Elastic IP or not). The way Route53 does its host/IP resolution is based on partitioning: when you do an nslookup on an EC2 host from outside EC2, it returns the public/internet IP address, but when you run nslookup from an EC2 machine in the same region, it returns the internal/private IP of the instance. So if you alias myname.mydomain.com on your DNS to the EC2 public hostname (e.g. ec2-a-b-c-d.compute-1.amazonaws.com), it will resolve to the internal or public IP based on where the nslookup was run from… It’s pretty trivial to stick a DNS name setup into a node’s chef/puppet setup to associate its custom hostname (*.mydomain.com) with its EC2 public hostname (e.g. if your DNS is Route53, then zone.records.create({ :name => "myname.domain.com", :value => "ec2-a-b-c-d.amazonaws.com", :type => "CNAME", :ttl => 1800 }))… See the fog gem docs for details (or whatever applies to the DNS you are using).

  • http://www.onepwr.org/about Srinivasan Mohan

    Sal – Config hostname is a simple bash block in the same recipe to just run hostname. Figured THTOWTDI :-)

    execute "Configure Hostname" do
      command "hostname --file /etc/hostname"
      action :nothing
    end

  • Msciwoj

    heh, “trivial” maybe for you ;)
    If I get you right I would need some DNS server so that:
    - each node would register itself first, providing the translation of its custom hostname to its ec2 dns name
    - each subsequent intra (EC2-to-EC2) request would use full custom hostname of the target EC2 node, this would be routed to DNS first, which would give back its original, registered EC2 node dns name?

    If that’s the case, I would still have few questions:
    - you say the DNS name to register against is to be public? If so, how it ensures the traffic is within internal AWS network only? Shouldn’t it be private DNS name though?
    - you give route 53 service as an example. Could it be some other, non-amazon and free DNS web service used instead? any suggestions?
    - I’m new to Linux world. I like to first understand how this works and potentially later use tools that automate it. You mention chef/puppet (some config tools?) to perform the DNS registration. Could the same be done using bash/perl scripts? Would you point to some examples?

  • http://www.onepwr.org/about Srinivasan Mohan

    lol :) Correct. You would need a dns service that you can push updates to via some sort of a script – I’ve used dynect and route53 which have pretty good apis for that. If you use bind on your own box(es) for the public DNS, you could implement a simple wrapper on top of it to do this as well (e.g. DLZ plugin for bind).

    “you say the DNS name to register against is to be public? If so, how it ensures the traffic is within internal AWS network only? Shouldn’t it be private DNS name though?”

    Instance’s hostname => Obviously available from running hostname :)
    Instance’s EC2 public name => Available from “curl
    (See http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AESDG-chapter-instancedata.html for details on getting a whole lot of instance metadata from AWS. Even if you don’t want to use Ruby, I would recommend installing the ohai gem, which can get you a lot of this info painlessly)

    Now that you have the hostname (e.g. *.mydomain.com) and the public hostname, make a request to your DNS API to add a CNAME record that aliases myhost.mydomain.com to the public hostname. A low TTL is recommended if you start/stop instances a lot and want DNS lookups for myhost.mydomain.com to pick up changes quickly.

    Like I said before, Route53 (which is used to resolve all ec2-***.amazonaws.com names) returns responses based on the DNS client’s network, i.e. if I resolve ec2-54-212-249-208.us-west-2.compute.amazonaws.com from:
    => Within EC2:
    [email protected]:~/parallel-s3sync# host ec2-54-211-219-218.us-west-2.compute.amazonaws.com
    ec2-54-211-219-218.us-west-2.compute.amazonaws.com has address
    (Caveat: The internal IP will be returned since mgmt01 is also in the same AWS datacenter i.e. Nslookup from a different AWS datacenter will return the global/public IP and not internal. As long as the lookup is made from the same region as the instance hostname being looked up, you will get the internal IP)

    =>From somewhere outside EC2, e.g. my laptop on some public wifi:
    [[email protected] ~]$ host ec2-54-211-219-218.us-west-2.compute.amazonaws.com
    ec2-54-211-219-218.us-west-2.compute.amazonaws.com has address
    [[email protected] ~]$
    (IP addresses changed to protect the innocents :-) )

    What this means is that if an instance at EC2 were to access myhost.mydomain.com, Route53 would return the internal IP. But if you were to access myhost.mydomain.com from home/work, you would get back the public IP of the instance. Once you have this in place, you can always refer to myhost.mydomain.com and not bother about having to locate the “real” IP of the instance, as your automation takes care of it.

    “you give route 53 service as an example. Could it be some other, non-amazon and free DNS web service used instead? any suggestions?”
    - I’ve used Route 53, Dynect, Zerigo and my-own-bind boxes … DNSMadeEasy, dnsimple too. Some may cost you. I like Route53: it’s cheap at $1 per zone per month and has a pretty good API.

    “Could the same be done using bash/perl scripts” – Oh sure! Nothing you can’t do in Perl :-) Examples would vary based on which DNS you use… CPAN has a nice set of provider-specific Perl modules for a bunch of DNS services… Also look up Stack Overflow and GitHub gists for starters.

  • Msciwoj

    Thanks a lot for the detailed explanation!
    Now the hardest part, I actually need to try and build my own scripts for that.
    One question before as I still have doubts:
    - would it really be possible to use DNS other than Route53 for this? I mean, it “magically” resolves DNS queries depending on the network the requests originates from. Isn’t it the case that it can only provide private IPs just because Route53 is within Amazon product, within AWS network and therefore aware of those private network IPs…
    Using other DNS would only return public IP which is not the point of the whole exercise, no?

  • http://www.onepwr.org/about Srinivasan Mohan

    Sure. You could use any DNS provider for your custom domain. All you need to ensure is that, on this DNS provider, you set up abc.yourdomain.com as a CNAME to ec2-a-b-c-d.amazonaws.com… That way abc.yourdomain.com will resolve to ec2-a-b-c-d… (your DNS provider will return that record). And ec2-a-b-c-d* will resolve to public IP a.b.c.d from outside and the instance’s private IP from inside EC2 (Amazon’s Route53 will return that response; nothing you need to do).

  • http://olicarbo.me Olivier Carbonneau

    Thanks! You saved me a lot of time! I just added something at the end of the Chef recipe:

    ohai 'reload' do
      action :reload
    end

    This way the automatic value stored in node['fqdn'] gets reloaded and is set to the new value. I include the recipe using include_recipe, and I need the value updated so the remainder of my recipe can access it without any issue. This is not required if the recipe runs alone, because the value in node['fqdn'] is loaded automatically when the execution starts.

    Hope my comment could help someone with the same setup!
