Merge lp:~nuclearbob/utah/cobbler-pidlock into lp:utah

Proposed by Max Brustkern
Status: Rejected
Rejected by: Max Brustkern
Proposed branch: lp:~nuclearbob/utah/cobbler-pidlock
Merge into: lp:utah
Diff against target: 90 lines (+44/-26)
1 file modified
utah/provisioning/inventory/sqlite.py (+44/-26)
To merge this branch: bzr merge lp:~nuclearbob/utah/cobbler-pidlock
Reviewer Review Type Date Requested Status
Max Brustkern (community) Disapprove
Review via email: mp+129046@code.launchpad.net

Description of the change

The original cobbler inventory used 'available' and 'provisioned' as states. If a job timed out and left the machine in the 'provisioned' state, there's no automatic process to recover that. This changes the inventory to use the pid of the process requesting the machine as the state, and if a machine is requested and that pid is no longer active on the system, the machine will be claimed anyway. If the listed pid is still active, the machine will not be used. I've tested it in magners-orchestra with the dx team's jenkins job. It could cause problems if multiple provisioning hosts were trying to use the same sqlite database to manage physical machines, but I think that scenario would introduce other problems as well. If we're dealing with multiple hosts trying to provision the same pool of machines, a more robust solution than sqlite is probably warranted.

To post a comment you must log in.
Revision history for this message
Joe Talbott (joetalbott) wrote :

Keep in mind that on a highly loaded machine PIDs get recycled so you might have a whole different process getting an available PID that is in the inventory database.

Revision history for this message
Javier Collado (javier.collado) wrote :

To take Joe's comment into account, you might want to check the process command line:

if psutil.pid_exists(pid) and 'run_utah_test.py' in '.join(psutil.Process(pid).cmdline):
    <machine is being used>

utah already depends on psutil, so this won't add an extra dependency to the package.

Revision history for this message
Javier Collado (javier.collado) wrote :

One more comment is that if you plan to use the pid to track machine reservation,
then the column name in the table should be renamed to pid.

lp:~nuclearbob/utah/cobbler-pidlock updated
708. By Max Brustkern

Adding command line check

709. By Max Brustkern

Added check for "run_cobbler_test.py"

710. By Max Brustkern

Changed run_cobbler_test to run_test_cobbler

Revision history for this message
Max Brustkern (nuclearbob) wrote :

I updated the check to look for "utah" or "run_test_cobbler.py" on the command line. I'm testing now. The thing about changing the name of the field is that it still could be "available" which isn't a pid. What's a good name that indicates it can either be a state or a pid? Or can we change available to some number that will never be a pid, like -1?

Revision history for this message
Joe Talbott (joetalbott) wrote :

On Thu, Oct 11, 2012 at 01:24:20PM -0000, Max Brustkern wrote:
> I updated the check to look for "utah" or "run_test_cobbler.py" on the command line. I'm testing now. The thing about changing the name of the field is that it still could be "available" which isn't a pid. What's a good name that indicates it can either be a state or a pid? Or can we change available to some number that will never be a pid, like -1?

What about changing the value to 'available - <pid>'? Then you can string
compare on 'available' and easily parse out the pid for doing the check.
> --
> https://code.launchpad.net/~nuclearbob/utah/cobbler-pidlock/+merge/129046
> Your team UTAH Dev is requested to review the proposed merge of lp:~nuclearbob/utah/cobbler-pidlock into lp:utah.

Revision history for this message
Javier Collado (javier.collado) wrote :

@Max

I think this looks like two different columns are needed. One for the state
and one for the pid and that the pid is used to check if the state can be trusted.
Encoding both things in one column could lead to confusion.

Another improvement could be to have a new table named states with all the possible
states as strings and replace the state column in the machines with state_id to be
used as a foreign key.

Finally, I'm not sure if this will grow in complexity in the future, but have you
thought about using an ORM instead of write sql commands directly?

Revision history for this message
Max Brustkern (nuclearbob) wrote :

I'll rework this and resubmit later.

review: Disapprove

Unmerged revisions

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'utah/provisioning/inventory/sqlite.py'
--- utah/provisioning/inventory/sqlite.py 2012-08-22 15:20:25 +0000
+++ utah/provisioning/inventory/sqlite.py 2012-10-11 13:23:20 +0000
@@ -2,6 +2,7 @@
22
3import sqlite33import sqlite3
4import os4import os
5import psutil
5from utah.provisioning.inventory.exceptions import \6from utah.provisioning.inventory.exceptions import \
6 UTAHProvisioningInventoryException7 UTAHProvisioningInventoryException
7from utah.provisioning.inventory.inventory import Inventory8from utah.provisioning.inventory.inventory import Inventory
@@ -112,35 +113,52 @@
112 else:113 else:
113 for machineinfo in result:114 for machineinfo in result:
114 cargs = dict(machineinfo)115 cargs = dict(machineinfo)
115 machineid = cargs.pop('machineid')116 if cargs['state'] == 'available':
116 name = cargs.pop('name')117 return self._take(cargs, *args, **kw)
117 state = cargs.pop('state')118 for machineinfo in result:
118 if state == 'available':119 cargs = dict(machineinfo)
119 update = self.connection.execute(120 name = cargs['name']
120 "UPDATE machines SET state='provisioned' "121 state = cargs['state']
121 "WHERE machineid=? AND state='available'",122 try:
122 [machineid]).rowcount123 pid = int(state)
123 if update == 1:124 try:
124 machine = CobblerMachine(*args, cargs=cargs,125 if not (psutil.pid_exists(pid) and ('utah' in
125 inventory=self, name=name, **kw)126 ' '.join(psutil.Process(pid).cmdline)
126 self.machines.append(machine)127 or 'run_test_cobbler.py' in
127 return machine128 ' '.join(psutil.Process(pid).cmdline))):
128 elif update == 0:129 return self._take(cargs, *args, **kw)
129 raise UTAHProvisioningInventoryException(130 except ValueError:
130 'Machine was requested by another process '131 continue
131 'before we could request it')132
132 elif update > 1:
133 raise UTAHProvisioningInventoryException(
134 'Multiple machines exist '
135 'matching those criteria; '
136 'database ' + self.db + ' may be corrupt')
137 else:
138 raise UTAHProvisioningInventoryException(
139 'Negative rowcount returned '
140 'when attempting to request machine')
141 raise UTAHProvisioningInventoryException(133 raise UTAHProvisioningInventoryException(
142 'All machines meeting criteria are currently unavailable')134 'All machines meeting criteria are currently unavailable')
143135
136 def _take(self, cargs, *args, **kw):
137 machineid = cargs.pop('machineid')
138 name = cargs.pop('name')
139 state = cargs.pop('state')
140 update = self.connection.execute(
141 "UPDATE machines SET state=? WHERE machineid=? AND state=?",
142 [os.getpid(), machineid, state]).rowcount
143 if update == 1:
144 machine = CobblerMachine(*args, cargs=cargs,
145 inventory=self, name=name, **kw)
146 self.machines.append(machine)
147 return machine
148 elif update == 0:
149 raise UTAHProvisioningInventoryException(
150 'Machine was requested by another process '
151 'before we could request it')
152 elif update > 1:
153 raise UTAHProvisioningInventoryException(
154 'Multiple machines exist '
155 'matching those criteria; '
156 'database ' + self.db + ' may be corrupt')
157 else:
158 raise UTAHProvisioningInventoryException(
159 'Negative rowcount returned '
160 'when attempting to request machine')
161
144 def release(self, machine=None, name=None):162 def release(self, machine=None, name=None):
145 if machine is not None:163 if machine is not None:
146 name = machine.name164 name = machine.name

Subscribers

People subscribed via source and target branches