Merge lp:~vasilev/psiphon/sprint3-457431 into lp:psiphon

Proposed by Robert Vasilev
Status: Merged
Approved by: Adam Kruger
Approved revision: 123
Merged at revision: 126
Proposed branch: lp:~vasilev/psiphon/sprint3-457431
Merge into: lp:psiphon
Diff against target: 193 lines (+143/-2)
4 files modified
trunk/cronjobs/cleanup_unused_accounts_cron.php (+95/-0)
trunk/cronjobs/crontab (+1/-1)
trunk/www/config.php (+12/-0)
trunk/www/includes/database_helpers.php (+35/-1)
To merge this branch: bzr merge lp:~vasilev/psiphon/sprint3-457431
Reviewer Review Type Date Requested Status
Adam Kruger Approve
Rod Pending
Review via email: mp+35397@code.launchpad.net

This proposal supersedes a proposal from 2010-08-31.

Description of the change

Fixes for https://bugs.launchpad.net/psiphon/+bug/457431

1. Added cleanup_unused_accounts_cron.php script for deleting inactive user accounts and related data (bookmarks, cookies, etc).
2. Added 3 configuration directives in config.php (inactivity periods for non-disabled and disabled users, and affected user classes list).
3. Updated crontab sample to invoke cleanup_unused_accounts_cron.php.

Upd:
Rod's comments have been implemented.

To post a comment you must log in.
Revision history for this message
Rod (rod-psiphon) wrote : Posted in a previous version of this proposal

1. Should not hard code user class numbers:

$config["unused_account_cleanup_affected_user_classes"] = array(2, 3, 4, 6, 7);

In includes/user_class_helpers.php, we have helper functions like get_power_user_class_id().

2. If the series of DELETE statements in delete_user_and_related_data fails in the middle (e.g., server crash) then we could be left with bookmark, invite, userproxy, etc. referring to non-existing user. This is probably bad.

Recommend doing "DELETE FROM user" last instead of first to minimize the risk of inconsistency in the database (we can't guarantee consistency with the MyISAM engine which lacks transactions)/

3. It appears that the SQL works like this:

php array = SELECT userid WHERE ...
DELETE WHERE id in (php array rendered as string) [repeat 6 times]

We recommend this pattern instead:

DELETE WHERE id in (SELECT userid WHERE ...) [repeat 6 times]

Reasoning: there is far less data sent between PHP and the database engine in the second approach; we don't know how big the user id list could be; MySQL ought to cache the results of the SELECT so redoing it in each DELETE should not incur full overhead.

review: Needs Fixing
Revision history for this message
Rod (rod-psiphon) : Posted in a previous version of this proposal
review: Approve
Revision history for this message
Adam Kruger (adam-kruger) :
review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== added file 'trunk/cronjobs/cleanup_unused_accounts_cron.php'
2--- trunk/cronjobs/cleanup_unused_accounts_cron.php 1970-01-01 00:00:00 +0000
3+++ trunk/cronjobs/cleanup_unused_accounts_cron.php 2010-09-14 12:50:56 +0000
4@@ -0,0 +1,95 @@
5+<?
6+/*
7+ Psiphon Circumvention Platform
8+ Copyright (C) 2009 Psiphon Inc.
9+
10+ This program is free software: you can redistribute it and/or modify
11+ it under the terms of the GNU General Public License as published by
12+ the Free Software Foundation, either version 3 of the License, or
13+ (at your option) any later version.
14+
15+ This program is distributed in the hope that it will be useful,
16+ but WITHOUT ANY WARRANTY; without even the implied warranty of
17+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18+ GNU General Public License for more details.
19+
20+ You should have received a copy of the GNU General Public License
21+ along with this program. If not, see <http://www.gnu.org/licenses/>.
22+*/
23+
24+// Add to crontab (to have it run every 1 minute):
25+//
26+// * * * * * /usr/local/bin/php /home/ppcron/cronjobs/cleanup_unused_accounts_cron.php
27+//
28+
29+// pwd of a cronjob seems to be the cron user's (ppcron) home directory.
30+// This messes up includes, so we're going to explicitly use the directory
31+// of this file.
32+$current_directory = substr($argv[0], 0, strrpos($argv[0], "/"));
33+
34+//
35+// Remove unused user accounts.
36+// Any user account of configured classes with last_activity older than account_ttl is deleted.
37+//
38+$_SERVER["DOCUMENT_ROOT"] = $current_directory;
39+include($current_directory."/includes/user_class_helpers.php");
40+include($current_directory."/config.php");
41+include($current_directory."/includes/sql_error_cli.php");
42+include($current_directory."/includes/cr.php");
43+include($current_directory."/includes/database_helpers.php");
44+
45+// Don't run if this job is already running.
46+
47+if ($f = @fopen($config["cleanup_unused_accounts_pid_file"], "r"))
48+{
49+ $pid = (int)fgets($f);
50+ fclose($f);
51+ exit("{$argv[0]} is already running, pid {$pid}\n");
52+}
53+
54+if ($f = @fopen($config["cleanup_unused_accounts_pid_file"], "w"))
55+{
56+ $pid = getmypid();
57+ fputs($f, $pid."\n");
58+ fclose($f);
59+}
60+else
61+{
62+ echo "Can't create {$config["cleanup_unused_accounts_pid_file"]}, keep running anyway.\n";
63+}
64+
65+
66+// Construct SQL condition of user classes configured for account cleanup.
67+$param_string = "";
68+$param_array = array();
69+string_list_to_params(implode(",", $config["unused_account_cleanup_affected_user_classes"]), ",", "p_", $param_string, $param_array);
70+$query_inactive_user_ids_from_part = " FROM user WHERE
71+user.last_activity < (unix_timestamp() - :unused_account_ttl)
72+AND
73+user.locked = :locked
74+AND
75+user.grp IN (".$param_string.")
76+";
77+
78+// Cleanup unused non-disabled accounts
79+$param_array[':locked'] = 0;
80+$param_array[':unused_account_ttl'] = $config["unused_account_ttl"];
81+delete_user_and_all_related_data($config, $query_inactive_user_ids_from_part, $param_array);
82+
83+// Cleanup unused disabled accounts
84+$param_array[':locked'] = 1;
85+$param_array[':unused_account_ttl'] = $config["unused_disabled_account_ttl"];
86+delete_user_and_all_related_data($config, $query_inactive_user_ids_from_part, $param_array);
87+
88+@unlink($config["cleanup_unused_accounts_pid_file"]);
89+
90+exit(0);
91+
92+function delete_user_and_all_related_data($config, $query_inactive_user_ids_from_part, Array $param_array) {
93+ // Call function from database_helpers.php
94+ if (!delete_user_and_related_data($config, $query_inactive_user_ids_from_part, $param_array)) {
95+ @unlink($config["cleanup_unused_accounts_pid_file"]);
96+ sql_error_cli("DELETE " . $query_inactive_user_ids_from_part, $config["sql"]);
97+ }
98+}
99+?>
100
101=== modified file 'trunk/cronjobs/crontab'
102--- trunk/cronjobs/crontab 2010-06-29 20:33:54 +0000
103+++ trunk/cronjobs/crontab 2010-09-14 12:50:56 +0000
104@@ -1,4 +1,4 @@
105 3 2 * * * #dest#/php/bin/php #dest#/ppcron/cleanup_invitations_cron.php
106 4 2 * * * #dest#/php/bin/php #dest#/ppcron/cleanup_email_candidates_cron.php
107 6 5 * * * #dest#/php/bin/php #dest#/ppcron/cleanup_sessions_cron.php
108-
109+5 4 * * * #dest#/php/bin/php #dest#/ppcron/cleanup_unused_accounts_cron.php
110
111=== modified file 'trunk/www/config.php'
112--- trunk/www/config.php 2010-06-28 20:44:20 +0000
113+++ trunk/www/config.php 2010-09-14 12:50:56 +0000
114@@ -54,6 +54,17 @@
115
116 $config["session_ttl"] = 60*60; // session expires after 1 idle hour
117
118+// unused non-disabled user accounts are deleted after 2 years of inactivity
119+$config["unused_account_ttl"] = 86400 * 365 * 2;
120+
121+// unused disabled user accounts are deleted after 5 months since disabled
122+$config["unused_disabled_account_ttl"] = 86400 * 30 * 5;
123+
124+// user groups/classes affected by automatic unused account cleanup
125+$config["unused_account_cleanup_affected_user_classes"] = array(get_power_user_class_id(),
126+ get_normal_user_class_id(), get_anonymous_user_class_id(),
127+ get_propagator_user_class_id(), get_reporter_user_class_id() );
128+
129 $config[min_pass_length] = 6;
130 $config[max_login_delay] = 30; // 30 seconds
131
132@@ -66,6 +77,7 @@
133 $config["cleanup_email_candidates_pid_file"] = "/var/run/ppcron/psiphon_cleanup_email_candidates.pid";
134 $config["cleanup_invitations_pid_file"] = "/var/run/ppcron/psiphon_cleanup_invitations.pid";
135 $config["cleanup_sessions_pid_file"] = "/var/run/ppcron/psiphon_cleanup_sessions.pid";
136+$config["cleanup_unused_accounts_pid_file"] = "/var/run/ppcron/psiphon_cleanup_unused_accounts.pid";
137
138 $config["psiphon_web_resource_page"] = "https://docs.google.com/View?id=dccxjzk4_0f76vf8g8";
139
140
141=== modified file 'trunk/www/includes/database_helpers.php'
142--- trunk/www/includes/database_helpers.php 2010-06-23 19:21:47 +0000
143+++ trunk/www/includes/database_helpers.php 2010-09-14 12:50:56 +0000
144@@ -1,7 +1,7 @@
145 <?php
146 /*
147 Psiphon Circumvention Platform
148- Copyright (C) 2009 Psiphon Inc.
149+ Copyright (C) 2010 Psiphon Inc.
150
151 This program is free software: you can redistribute it and/or modify
152 it under the terms of the GNU General Public License as published by
153@@ -173,6 +173,40 @@
154 return $user_record;
155 }
156
157+/**
158+ * Deletes users and all related data
159+ *
160+ * @param $config
161+ * @param $query_affected_user_ids_from_part SQL conditions to select affected user ids with placeholders
162+ * @param $param_array placeholder values
163+ * @return bool true on success, false on error
164+ */
165+function delete_user_and_related_data($config, $query_affected_user_ids_from_part, Array $param_array) {
166+ // We don't know how big the user id list could be; So, we will use subselect here (Rod's recommendation)
167+ $query_affected_user_ids = 'SELECT user.id ' . $query_affected_user_ids_from_part;
168+
169+ $query_delete_bookmark = "DELETE FROM bookmark WHERE user IN (" . $query_affected_user_ids . ")";
170+ $query_delete_cookie = "DELETE FROM cookie WHERE user IN (" . $query_affected_user_ids . ")";
171+ $query_delete_invitation = "DELETE FROM invitation WHERE user IN (" . $query_affected_user_ids . ")";
172+ $query_delete_ticket = "DELETE FROM ticket WHERE user IN (" . $query_affected_user_ids . ")";
173+ $query_delete_userproxy = "DELETE FROM userproxy WHERE userid IN (" . $query_affected_user_ids . ")";
174+ // Delete from user table only after deleting all other data
175+ $query_delete_user = "DELETE " . $query_affected_user_ids_from_part;
176+
177+ if (
178+ !db_query_execute($config, $query_delete_bookmark, convert_null_array($param_array), true) ||
179+ !db_query_execute($config, $query_delete_cookie, convert_null_array($param_array), true) ||
180+ !db_query_execute($config, $query_delete_invitation, convert_null_array($param_array), true) ||
181+ !db_query_execute($config, $query_delete_ticket, convert_null_array($param_array), true) ||
182+ !db_query_execute($config, $query_delete_userproxy, convert_null_array($param_array), true) ||
183+ !db_query_execute($config, $query_delete_user, convert_null_array($param_array), true)
184+ ) {
185+ return false;
186+ }
187+
188+ return true;
189+}
190+
191 /* Turns a (comma-separated, usually) list of values into a values that can be used in a
192 * parameterized SQL query.
193 *

Subscribers

People subscribed via source and target branches