Merge lp:~cgb-cs/appscale/main-consistency into lp:appscale

Proposed by Chris Bunch
Status: Rejected
Rejected by: Navraj Chohan
Proposed branch: lp:~cgb-cs/appscale/main-consistency
Merge into: lp:appscale
Diff against target: 446 lines (+158/-42)
8 files modified
AppController/djinn.rb (+64/-10)
AppController/lib/repo.rb (+21/-9)
AppDB/cassandra/cassandra_helper.rb (+31/-0)
AppDB/cassandra/py_cassandra.py (+23/-10)
AppDB/voldemort/voldemort_helper.rb (+2/-2)
AppServer/demos/therepo/repo.py (+2/-2)
Neptune/neptune.rb (+8/-5)
Neptune/ssa_helper.rb (+7/-4)
To merge this branch: bzr merge lp:~cgb-cs/appscale/main-consistency
Reviewer Review Type Date Requested Status
Navraj Chohan (community) Disapprove
Review via email: mp+77709@code.launchpad.net

Description of the change

Adds the ability to specify consistency levels for Cassandra. The tools-consistency branch is also required for this change.
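The mechanism (commit 786 and the cassandra_helper.rb hunk below) is to rewrite the READ_FACTOR and WRITE_FACTOR constants in py_cassandra.py at deployment time. A minimal Ruby sketch of that substitution, assuming creds is the deployment-credentials Hash; the rewrite_consistency name is illustrative:

    # Sketch: patch the consistency constants in py_cassandra.py when the
    # user supplies read_factor/write_factor credentials (e.g. "ONE", "QUORUM").
    def rewrite_consistency(interface_path, creds)
      return if creds["read_factor"].nil? and creds["write_factor"].nil?
      contents = File.read(interface_path)
      unless creds["read_factor"].nil?
        contents.gsub!(/READ_FACTOR = (.*)/,
                       "READ_FACTOR = CONSISTENCY_#{creds['read_factor']}")
      end
      unless creds["write_factor"].nil?
        contents.gsub!(/WRITE_FACTOR = (.*)/,
                       "WRITE_FACTOR = CONSISTENCY_#{creds['write_factor']}")
      end
      File.open(interface_path, "w") { |file| file.write(contents) }
    end

    # e.g. rewrite_consistency("py_cassandra.py",
    #                          "read_factor" => "ONE", "write_factor" => "ALL")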

lp:~cgb-cs/appscale/main-consistency updated
787. By Chris Bunch

refactored get_nearest_ip to ping all db nodes and return the one that responded fastest, instead of just picking one at random
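A minimal Ruby sketch of that selection strategy (the fastest_db_ip name is illustrative; the real code is in the djinn.rb hunk below):

    # Sketch: ping each DB node and return the IP with the lowest average
    # round-trip time. Falls back to the first IP if no node answers.
    def fastest_db_ip(db_ips, count = 5, timeout = 10)
      best_ip = db_ips.first
      best_avg = nil
      db_ips.each do |ip|
        output = `ping #{ip} -c #{count} -w #{timeout}`
        times = output.scan(/time=([\d.]+) ms/).flatten.map { |t| Float(t) }
        next if times.empty?  # no replies from this node; keep current best
        avg = times.inject(0.0) { |sum, t| sum + t } / times.length
        if best_avg.nil? or avg < best_avg
          best_ip = ip
          best_avg = avg
        end
      end
      best_ip
    end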

788. By Chris Bunch

fixed repo app to store and retrieve binary items with appdb, and restored repo template, which was corrupted by a previous revision. also refactored get_nearest_db_ip to default to the first db node in case other machines don't respond to pings
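The binary-safety part of this fix amounts to Base64-encoding values before POSTing them to the repo and decoding them on the way back out. A sketch of the round trip, assuming the repo's /set and /get endpoints on port 8079 as shown in the repo.rb hunk below (repo_ip and secret are placeholders):

    require 'net/http'
    require 'uri'
    require 'base64'

    # Sketch: store a possibly binary value in the repo, then read it back.
    def repo_set(repo_ip, secret, key, val, type)
      params = {'SECRET' => secret, 'KEY' => key,
                'VALUE' => Base64.encode64(val), 'TYPE' => type}
      Net::HTTP.post_form(URI.parse("http://#{repo_ip}:8079/set"), params).body
    end

    def repo_get(repo_ip, secret, key, type)
      params = {'SECRET' => secret, 'KEY' => key, 'TYPE' => type}
      body = Net::HTTP.post_form(URI.parse("http://#{repo_ip}:8079/get"), params).body
      Base64.decode64(body)  # undo the encoding applied by repo_set
    end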

789. By Chris Bunch

re-enabling hybrid cloud Neptune jobs and now properly removing output from SSA jobs

790. By Chris Bunch

fixed copy-pasta'd code used for hybrid cloud deployments
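The copy-paste bug in question (see the neptune.rb hunk below) was building a Hash from a flat array without the splat operator; Hash[*array] is what expands the array into alternating key/value arguments:

    nodes_needed = ["cloud1", 4, "cloud2", 2]
    Hash[*nodes_needed]  # => {"cloud1" => 4, "cloud2" => 2}
    # Hash[nodes_needed], without the splat, does not produce this mapping.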

791. By Chris Bunch

added documentation, installed nslookup on lucid, and told the ssa helper to log when it is done running a job rather than reporting individual run times (as there can be up to 1 million of them)
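A minimal sketch of the hostname-to-IP conversion used when writing /etc/hosts entries, assuming nslookup (from the dnsutils package) is installed; it mirrors the regex in the djinn.rb hunk below:

    # Sketch: resolve a hostname to a dotted-quad IPv4 address via nslookup.
    def to_ip(host)
      ip_pattern = /\d+\.\d+\.\d+\.\d+/
      return host if host =~ ip_pattern  # already an IP address
      `nslookup #{host}`.scan(/Address: (#{ip_pattern})/).flatten.first
    end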

Revision history for this message
Navraj Chohan (nchohan) :
review: Disapprove

Unmerged revisions

791. By Chris Bunch

added documentation, installed nslookup on lucid, and told the ssa helper to log when it is done running a job rather than reporting individual run times (as there can be up to 1 million of them)

790. By Chris Bunch

fixed copy-pasta'd code used for hybrid cloud deployments

789. By Chris Bunch

re-enabling hybrid cloud Neptune jobs and now properly removing output from SSA jobs

788. By Chris Bunch

fixed repo app to store and retrieve binary items with appdb, and restored repo template, which was corrupted by a previous revision. also refactored get_nearest_db_ip to default to the first db node in case other machines don't respond to pings

787. By Chris Bunch

refactored get_nearest_ip to ping all db nodes and return the one that responded fastest, instead of just picking one at random

786. By Chris Bunch

added ability to alter cassandra's read and write consistency settings

Preview Diff

1=== modified file 'AppController/djinn.rb'
2--- AppController/djinn.rb 2011-09-18 03:02:52 +0000
3+++ AppController/djinn.rb 2011-10-11 17:59:26 +0000
4@@ -98,11 +98,16 @@
5
6 public
7
8+ # This method is exposed via SOAP and lets the tools and other AppControllers
9+ # know if this node is done starting up all the services it runs.
10 def done(secret)
11 return BAD_SECRET_MSG unless valid_secret?(secret)
12 return @done_loading
13 end
14
15+ # This method is exposed via SOAP and is used to stop all the services in a
16+ # given node for Xen and KVM deployments. In cloud deployments, this method
17+ # kills all the nodes across all clouds.
18 def kill(secret)
19 return BAD_SECRET_MSG unless valid_secret?(secret)
20 @kill_sig_received = true
21@@ -155,6 +160,13 @@
22 return "OK"
23 end
24
25+ # This method is exposed via SOAP and is called to initialize an
26+ # AppController. This specifically involves telling it who else is running
27+ # in the system (djinn_locations) and what they are working on, how to
28+ # access the datastore (database_credentials), and the list of applications
29+ # that should be loaded (app_names). The tools will call this method on the
30+ # first node in the system (the master), and the master will call this method
31+ # on the other nodes once they are ready.
32 def set_parameters(djinn_locations, database_credentials, app_names, secret)
33 return BAD_SECRET_MSG unless valid_secret?(secret)
34
35@@ -212,6 +224,8 @@
36 return "OK"
37 end
38
39+ # Tells an AppController which applications should be loaded (via their
40+ # application IDs).
41 def set_apps(app_names, secret)
42 return BAD_SECRET_MSG unless valid_secret?(secret)
43
44@@ -649,8 +663,34 @@
45 # If there is a local database then use it
46 local_ip
47 else
48- # Otherwise just select one randomly
49- db_ips.sort_by { rand }[0]
50+ # Otherwise, ping all of them five times and pick the one that responded
51+ # the fastest. Note that this can cause problems if a node doesn't respond
52+ # to pings.
53+ num_times_to_ping = 5
54+ timeout = 10 # seconds
55+
56+ Djinn.log_debug("Finding the fastest DB node to use")
57+
58+ fastest_ip = db_ips[0]
59+ fastest_time = INFINITY
60+ db_ips.each { |ip|
61+ ping_data = `ping #{ip} -c #{num_times_to_ping} -w #{timeout}`
62+ times = ping_data.scan(/time=(.*) ms/).flatten
63+ times = times.map! { |time| Float(time) } # convert Strings to Floats
64+ sum = times.reduce(0.0) { |sum, val| sum + val}
65+ avg = sum / times.length
66+ if avg < fastest_time
67+ Djinn.log_debug("Found faster node: #{ip} responded in #{avg} ms")
68+ fastest_ip = ip
69+ fastest_time = avg
70+ end
71+ }
72+
73+ Djinn.log_debug("The node that responded the fastest out of the ips " +
74+ "[#{db_ips.join(', ')}] was #{fastest_ip}, which responded in " +
75+ "#{fastest_time} msec")
76+
77+ return fastest_ip
78 end
79 end
80
81@@ -1003,9 +1043,9 @@
82 end
83
84 @nodes.each { |node|
85- #pub = location.public_ip
86+ #pub = node.public_ip
87 #if pub =~ /#{FQDN_REGEX}/
88- # location.public_ip = HelperFunctions.convert_fqdn_to_ip(pub)
89+ # node.public_ip = HelperFunctions.convert_fqdn_to_ip(pub)
90 #end
91
92 pri = node.private_ip
93@@ -1013,7 +1053,7 @@
94 begin
95 node.private_ip = HelperFunctions.convert_fqdn_to_ip(pri)
96 rescue Exception => e
97- node.private_ip = location.public_ip
98+ node.private_ip = node.public_ip
99 end
100 end
101 }
102@@ -1178,10 +1218,14 @@
103 # for neptune jobs, start a place where they can save output to
104 # also, since repo does health checks on the app engine apis, start it up there too
105
106- repo_ip = get_shadow.public_ip
107- repo_private_ip = get_shadow.private_ip
108- repo_ip = my_node.public_ip if my_node.is_appengine?
109- repo_private_ip = my_node.private_ip if my_node.is_appengine?
110+ if my_node.is_appengine?
111+ repo_ip = my_node.public_ip
112+ repo_private_ip = my_node.private_ip
113+ else
114+ repo_ip = get_shadow.public_ip
115+ repo_private_ip = get_shadow.private_ip
116+ end
117+
118 Repo.init(repo_ip, repo_private_ip, @@secret)
119
120 if my_node.is_shadow? or my_node.is_appengine?
121@@ -1503,9 +1547,19 @@
122 # Invoke datastore helper function
123 setup_db_config_files(master_ip, slave_ips, @creds)
124
125+ # lucid doesn't have nslookup - for now just install it
126+ Djinn.log_run("apt-get install -y dnsutils")
127+
128 all_nodes = ""
129+ ip_regex = /\d+\.\d+\.\d+\.\d+/
130 @nodes.each_with_index { |node, index|
131- all_nodes << "#{node.private_ip} appscale-image#{index}\n"
132+ ip_to_use = node.private_ip
133+ if ip_to_use !~ ip_regex
134+ Djinn.log_debug("[etc hosts] #{ip_to_use} wasn't an IP address. converting...")
135+ ip_to_use = `nslookup #{ip_to_use}`.scan(/Address: (#{ip_regex})/).flatten.to_s
136+ Djinn.log_debug("[etc hosts] #{node.private_ip} was converted to #{ip_to_use}")
137+ end
138+ all_nodes << "#{ip_to_use} appscale-image#{index}\n"
139 }
140
141 etc_hosts = "/etc/hosts"
142
143=== modified file 'AppController/lib/repo.rb'
144--- AppController/lib/repo.rb 2011-08-10 03:22:11 +0000
145+++ AppController/lib/repo.rb 2011-10-11 17:59:26 +0000
146@@ -127,8 +127,16 @@
147
148 def self.get(key, type, storage, creds, is_file=false)
149 if storage == "appdb"
150- result = `curl http://#{@@ip}:8079/get -X POST -d 'SECRET=#{@@secret}' -d 'KEY=#{key}' -d 'TYPE=#{type}'`
151- result = URI.unescape(result)
152+ Djinn.log_debug("performing a get on key [#{key}], type [#{type}]")
153+ get_url = "http://#{@@ip}:8079/get"
154+ params = {'SECRET' => @@secret, 'KEY' => key, 'TYPE' => type}
155+ data = Net::HTTP.post_form(URI.parse(get_url), params).body
156+ decoded_data = Base64.decode64(data)
157+
158+ if is_file
159+ HelperFunctions.write_file(is_file, decoded_data)
160+ end
161+ result = decoded_data
162 elsif storage == "s3"
163 conn = self.get_s3_conn(creds)
164 bucket, file = self.parse_s3_key(key)
165@@ -179,20 +187,24 @@
166
167 result = false
168 begin
169- res = Net::HTTP.post_form(URI.parse("http://#{@@ip}:8079/set"),
170- {'SECRET' => @@secret, 'KEY' => key,
171- 'VALUE' => val, 'TYPE' => type})
172+ encoded_val = Base64.encode64(val)
173+ set_url = "http://#{@@ip}:8079/set"
174+ params = {'SECRET' => @@secret, 'KEY' => key,
175+ 'VALUE' => encoded_val, 'TYPE' => type}
176+ res = Net::HTTP.post_form(URI.parse(set_url), params)
177 Djinn.log_debug("set key=#{key} type=#{type} returned #{res.body}")
178 result = true if res.body == "success"
179 rescue Exception => e
180 Djinn.log_debug("saw exception #{e.class} when posting userdata to repo at #{key}")
181 end
182-
183 end
184 else
185- Djinn.log_debug("attempting to put local file #{val} into bucket #{bucket}, location #{file}")
186- val = URI.escape(val, Regexp.new("[^#{URI::PATTERN::UNRESERVED}]"))
187- result = `curl http://#{@@ip}:8079/set -X POST -d 'SECRET=#{@@secret}' -d 'KEY=#{key}' -d 'VALUE=#{val}' -d 'TYPE=#{type}'`
188+ Djinn.log_debug("attempting to put local file into location #{key}")
189+ encoded_val = Base64.encode64(val)
190+ set_url = "http://#{@@ip}:8079/set"
191+ params = {'SECRET' => @@secret, 'KEY' => key,
192+ 'VALUE' => encoded_val, 'TYPE' => type}
193+ result = Net::HTTP.post_form(URI.parse(set_url), params).body
194 Djinn.log_debug("set key=#{key} type=#{type} returned #{result}")
195 result = true if result == "success"
196 end
197
198=== modified file 'AppDB/cassandra/cassandra_helper.rb'
199--- AppDB/cassandra/cassandra_helper.rb 2011-05-30 01:04:15 +0000
200+++ AppDB/cassandra/cassandra_helper.rb 2011-10-11 17:59:26 +0000
201@@ -36,6 +36,37 @@
202 }
203 }
204 }
205+
206+ setup_consistency_settings(creds)
207+end
208+
209+# Writes the Cassandra interface file's read and write consistency levels.
210+# Requires the parameter 'creds' to have defined at least a read or write
211+# policy - if these are not defined, the default value in py_cassandra.py
212+# is used.
213+
214+def setup_consistency_settings(creds)
215+ return if creds["read_factor"].nil? and creds["write_factor"].nil?
216+ cassandra_interface = "#{APPSCALE_HOME}/AppDB/cassandra/py_cassandra.py"
217+
218+ contents = ""
219+ File.open(cassandra_interface) { |source_file|
220+ contents = source_file.read
221+
222+ if !creds["read_factor"].nil?
223+ read_factor = "CONSISTENCY_#{creds['read_factor']}"
224+ contents.gsub!(/READ_FACTOR = (.*)/, "READ_FACTOR = #{read_factor}")
225+ end
226+
227+ if !creds["write_factor"].nil?
228+ write_factor = "CONSISTENCY_#{creds['write_factor']}"
229+ contents.gsub!(/WRITE_FACTOR = (.*)/, "WRITE_FACTOR = #{write_factor}")
230+ end
231+ }
232+
233+ File.open(cassandra_interface, "w+") { |dest_file|
234+ dest_file.write(contents)
235+ }
236 end
237
238 def start_db_master()
239
240=== modified file 'AppDB/cassandra/py_cassandra.py'
241--- AppDB/cassandra/py_cassandra.py 2011-05-31 00:01:41 +0000
242+++ AppDB/cassandra/py_cassandra.py 2011-10-11 17:59:26 +0000
243@@ -38,10 +38,19 @@
244 DEFAULT_HOST = "localhost"
245 DEFAULT_PORT = 9160
246
247-#CONSISTENCY_ZERO = 0 # don't use this for reads
248+
249+CONSISTENCY_ZERO = 0 # don't use this for reads
250+CONSISTENCY_ANY = pycassa.cassandra.ttypes.ConsistencyLevel.ANY
251 CONSISTENCY_ONE = pycassa.cassandra.ttypes.ConsistencyLevel.ONE
252 CONSISTENCY_QUORUM = pycassa.cassandra.ttypes.ConsistencyLevel.QUORUM
253-#CONSISTENCY_ALL = 5 # don't use this for reads (next version may fix this)
254+CONSISTENCY_ALL = pycassa.cassandra.ttypes.ConsistencyLevel.ALL # don't use this for reads (next version may fix this)
255+
256+# These lines should not be removed: the AppController will replace these
257+# values if the user specifies it via the command-line. These default values
258+# are chosen to favor reads but keep strong consistency (consistent with the
259+# 20:1 read/write ratio cited by the Megastore paper).
260+READ_FACTOR = CONSISTENCY_ONE
261+WRITE_FACTOR = CONSISTENCY_ALL
262
263 MAX_ROW_COUNT = 10000000
264 table_cache = {}
265@@ -69,8 +78,8 @@
266 path = ColumnPath(COLUMN_FAMILY)
267 client = self.__setup_connection()
268 # Result is a column type which has name, value, timestamp
269- result = client.get_slice(row_key, path, slice_predicate,
270- CONSISTENCY_QUORUM)
271+ global READ_FACTOR
272+ result = client.get_slice(row_key, path, slice_predicate, READ_FACTOR)
273 for column in column_names:
274 for r in result:
275 c = r.column
276@@ -111,7 +120,8 @@
277 mutation = Mutation(column_or_supercolumn=c_or_sc)
278 mutations.append(mutation)
279 mutation_map = {row_key : { COLUMN_FAMILY : mutations } }
280- client.batch_mutate(mutation_map, CONSISTENCY_QUORUM)
281+ global WRITE_FACTOR
282+ client.batch_mutate(mutation_map, WRITE_FACTOR)
283 """except Exception, ex:
284 print "EXCEPTION"
285 self.logger.debug("Exception %s" % ex)
286@@ -137,10 +147,11 @@
287 end_key = table_name + '/~'
288 try:
289 cf = pycassa.ColumnFamily(self.pool, 'Standard1')
290+ global READ_FACTOR
291 keyslices = cf.get_range(columns=column_names,
292 start=start_key,
293 finish=end_key,
294- read_consistency_level=CONSISTENCY_QUORUM)
295+ read_consistency_level=READ_FACTOR)
296 keyslices = list(keyslices)
297 except Exception, ex:
298 self.logger.debug("Exception %s" % ex)
299@@ -173,8 +184,8 @@
300 client = self.__setup_connection()
301 curtime = self.timestamp()
302 # Result is a column type which has name, value, timestamp
303- client.remove(row_key, path, curtime,
304- CONSISTENCY_QUORUM)
305+ global WRITE_FACTOR
306+ client.remove(row_key, path, curtime, WRITE_FACTOR)
307 except Exception, ex:
308 self.logger.debug("Exception %s" % ex)
309 ret[0]+=("Exception: %s"%ex)
310@@ -213,10 +224,11 @@
311 end_key = table_name + '/~'
312 try:
313 cf = pycassa.ColumnFamily(self.pool, 'Standard1')
314+ global READ_FACTOR
315 keyslices = cf.get_range(columns=[],
316 start=start_key,
317 finish=end_key,
318- read_consistency_level=CONSISTENCY_QUORUM)
319+ read_consistency_level=READ_FACTOR)
320 except Exception, ex:
321 self.logger.debug("Exception %s" % ex)
322 result[0]+=("Exception: %s"%ex)
323@@ -225,10 +237,11 @@
324 for keyslice in keyslices:
325 row_key = keyslice[0]
326 client = self.__setup_connection()
327+ global WRITE_FACTOR
328 client.remove(row_key,
329 path,
330 curtime,
331- CONSISTENCY_QUORUM)
332+ WRITE_FACTOR)
333 keys_removed = True
334 if table_name not in table_cache and keys_removed:
335 result[0] += "Table does not exist"
336
337=== modified file 'AppDB/voldemort/voldemort_helper.rb'
338--- AppDB/voldemort/voldemort_helper.rb 2010-05-27 21:59:59 +0000
339+++ AppDB/voldemort/voldemort_helper.rb 2011-10-11 17:59:26 +0000
340@@ -39,8 +39,8 @@
341 # TODO: this should not use djinn class field.
342 setup_cluster_config(voldemort_conf_loc, database_nodes)
343 setup_server_config(voldemort_server_template, voldemort_conf_loc, my_db_id)
344- r = creds["voldemortr"]
345- w = creds["voldemortw"]
346+ r = creds["read_factor"]
347+ w = creds["write_factor"]
348 setup_stores_config(voldemort_stores_temp, voldemort_stores_loc, creds["replication"], r, w)
349 end # setup
350
351
352=== modified file 'AppServer/demos/therepo/repo.py'
353--- AppServer/demos/therepo/repo.py 2011-06-28 21:08:21 +0000
354+++ AppServer/demos/therepo/repo.py 2011-10-11 17:59:26 +0000
355@@ -31,7 +31,7 @@
356
357 import logging
358
359-SECRET = "SeHIb1ctOKWJ3RyBLPL1dE0XqJe52dMZ"
360+SECRET = "PLACE SECRET HERE"
361
362 NO_SECRET = "you failed to provide a secret"
363 BAD_SECRET = "you provided a bad secret"
364@@ -134,7 +134,7 @@
365
366 if type == "output":
367 entry = Entry(key_name = key)
368- entry.content = db.Blob(value)
369+ entry.content = db.Blob(str(value))
370 entry.acl = "private"
371 else: # type is acl
372 entry = Entry.get_by_key_name(key)
373
374=== modified file 'Neptune/neptune.rb'
375--- Neptune/neptune.rb 2011-06-07 01:42:50 +0000
376+++ Neptune/neptune.rb 2011-10-11 17:59:26 +0000
377@@ -33,7 +33,7 @@
378 Djinn.log_debug("got run request - #{job_data.inspect}")
379
380 prejob_status = can_run_job(job_data)
381- Djinn.log_debug("Pre-job status for job_data [#{job_data}] is [#{prejob_status}]")
382+ Djinn.log_debug("Pre-job status for job_data [#{job_data.inspect}] is [#{prejob_status}]")
383 unless prejob_status == :ok
384 return prejob_status
385 end
386@@ -62,6 +62,10 @@
387 end
388
389 code = job_data['@code']
390+ if code.nil? # e.g., in SSA runs, where the code is specified via '@tar'
391+ code = job_data['@tar']
392+ end
393+
394 dirs = code.split(/\//)
395 code_dir = dirs[0, dirs.length-1].join("/")
396
397@@ -463,7 +467,8 @@
398 Djinn.log_debug("acquiring nodes for hybrid cloud neptune job")
399
400 if nodes_needed.class == Array
401- nodes_needed = Hash[nodes_needed]
402+ Djinn.log_debug("received array with contents: #{nodes_needed.join(', ')}")
403+ nodes_needed = Hash[*nodes_needed]
404 Djinn.log_debug("request received to spawn hybrid nodes: #{nodes_needed.inspect}")
405 else
406 Djinn.log_debug("nodes_needed was not the right class - should have been Array but was #{nodes_needed.class}")
407@@ -565,9 +570,7 @@
408 end
409
410 def neptune_release_nodes(nodes_to_use, job_data)
411- if is_hybrid_cloud?
412- abort("hybrid cloud mode is definitely not supported")
413- elsif is_cloud?
414+ if is_hybrid_cloud? or is_cloud?
415 nodes_to_use.each { |node|
416 node.set_roles("open")
417 }
418
419=== modified file 'Neptune/ssa_helper.rb'
420--- Neptune/ssa_helper.rb 2011-06-02 03:35:37 +0000
421+++ Neptune/ssa_helper.rb 2011-10-11 17:59:26 +0000
422@@ -89,7 +89,10 @@
423 loop {
424 trajectories_left = trajectories - done
425 Djinn.log_debug("Need to run #{trajectories_left} more trajectories on #{cores} cores")
426- break if trajectories_left.zero?
427+ if trajectories_left.zero?
428+ Djinn.log_debug("Done running trajectories!")
429+ break
430+ end
431 need_to_run = [trajectories_left, cores].min
432
433 Djinn.log_debug("Running #{need_to_run} trajectories")
434@@ -167,10 +170,10 @@
435 TIMING: stddev compute time is #{standard_deviation(c_times)} seconds.
436 TIMING: average storage time is #{average(s_times)} seconds.
437 TIMING: stddev storage time is #{standard_deviation(s_times)} seconds.
438- RAW_DATA: node times are: [#{node_times.join(', ')}]
439- RAW_DATA: compute times are: [#{c_times.join(', ')}]
440- RAW_DATA: storage times are: [#{s_times.join(', ')}]
441 BAZ
442+ #RAW_DATA: node times are: [#{node_times.join(', ')}]
443+ #RAW_DATA: compute times are: [#{c_times.join(', ')}]
444+ #RAW_DATA: storage times are: [#{s_times.join(', ')}]
445
446 Djinn.log_debug(timing_info)
447
