Merge lp:~ben-hutchings/ensoft-sextant/filter-search into lp:ensoft-sextant

Proposed by Ben Hutchings
Status: Superseded
Proposed branch: lp:~ben-hutchings/ensoft-sextant/filter-search
Merge into: lp:ensoft-sextant
Prerequisite: lp:~ben-hutchings/ensoft-sextant/autocomplete-fix
Diff against target: 696 lines (+239/-112)
8 files modified
resources/sextant/web/interface.html (+2/-2)
src/sextant/__main__.py (+8/-5)
src/sextant/db_api.py (+118/-59)
src/sextant/export.py (+1/-1)
src/sextant/objdump_parser.py (+82/-33)
src/sextant/test_parser.py (+1/-1)
src/sextant/update_db.py (+15/-8)
src/sextant/web/server.py (+12/-3)
To merge this branch: bzr merge lp:~ben-hutchings/ensoft-sextant/filter-search
Reviewer    Review Type    Date Requested    Status
Robert                                       Approve
Review via email: mp+242182@code.launchpad.net

This proposal supersedes a proposal from 2014-11-19.

This proposal has been superseded by a proposal from 2014-11-21.

Commit message

Function name search within the web frontend now supports an extended syntax:
'<name matches>:<file path matches>'
where <name matches> and <file path matches> are each either a single pattern or a comma-separated list of patterns, and may include the wildcard '.*'. At least one of the two must be specified.

Fixed a bug where inline functions were uploaded to the database multiple times.
Fixed a bug where function identifiers had too much stripped from their names.
Fixed a bug where some functions were not uploaded at all.

Note: some tests currently fail. This is not because the change is broken: some internal details have changed, and the tests need to be updated to match.
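
The extended search syntax described above can be illustrated with a small sketch. This is not the code from the branch: `split_search` and `is_wildcard` are hypothetical helpers, and treating any pattern that contains '*' as a wildcard is an assumption made for illustration.

```python
def split_search(search):
    """Split '<name matches>:<file path matches>' into two pattern lists.

    Hypothetical helper for illustration only.
    """
    # partition() copes with a missing ':' (the path part is then '').
    names, _, paths = search.partition(':')

    # Drop empty entries so that 'main,' or ':' yield empty lists.
    name_patterns = [p for p in names.split(',') if p]
    path_patterns = [p for p in paths.split(',') if p]

    if not name_patterns and not path_patterns:
        raise ValueError('specify at least one name or file path pattern')
    return name_patterns, path_patterns


def is_wildcard(pattern):
    """Assumption: a pattern is a wildcard if it contains '*'."""
    return '*' in pattern


# Example: two name patterns (one wildcard) and one exact file path.
name_patterns, path_patterns = split_search('main,parse_.*:/src/parser.c')
```

Exact patterns and wildcard patterns are kept apart here because an exact match can be answered from an index lookup, whereas a wildcard has to be evaluated as a regular expression.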

Robert (rjwills) :
review: Approve
Martin Morrison (isoschiz) wrote :

The prerequisite lp:~ben-hutchings/ensoft-sextant/autocomplete-fix has not yet been merged into lp:ensoft-sextant.

48. By Ben Hutchings

merge from autocomplete-fix

49. By Ben Hutchings

another merge from autocomplete-fix

50. By Ben Hutchings

fixed bug causing extra characters to be removed from the start of symbol names
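
For context on this class of bug: Python's `str.lstrip` takes a set of characters rather than a prefix, so using it to remove a prefix such as `__be_` can also eat leading characters of the real symbol name. The sketch below shows the pitfall and a slice-based alternative. `strip_prefix` is a hypothetical helper, and whether this is exactly what this revision changed is an assumption.

```python
def strip_prefix(name, prefix='__be_'):
    """Remove `prefix` from the start of `name` once, if it is present.

    Hypothetical helper for illustration only.
    """
    if name.startswith(prefix):
        return name[len(prefix):]
    return name


symbol = '__be_begin'

# lstrip() removes every leading '_', 'b' and 'e', not just the prefix.
over_stripped = symbol.lstrip('__be_')

# Slicing off the prefix length leaves the rest of the name untouched.
correct = strip_prefix(symbol)
```

Here `over_stripped` is 'gin' while `correct` is 'begin'.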

Unmerged revisions

Preview Diff

=== modified file 'resources/sextant/web/interface.html'
--- resources/sextant/web/interface.html 2014-11-21 12:34:24 +0000
+++ resources/sextant/web/interface.html 2014-11-21 12:34:25 +0000
@@ -27,8 +27,8 @@
 All functions calling specific function</option>
 <option value="functions_called_by">
 All functions called by a specific function</option>
-<option value="all_call_paths">
-All function call paths between two functions</option>
+<!--option value="all_call_paths"> REMOVED AS THIS IS SLOW FOR IOS
+All function call paths between two functions</option-->
 <option value="shortest_call_path">
 Shortest path between two functions</option>
 <option value="function_names">
=== modified file 'src/sextant/__main__.py'
--- src/sextant/__main__.py 2014-10-17 15:30:14 +0000
+++ src/sextant/__main__.py 2014-11-21 12:34:25 +0000
@@ -127,16 +127,13 @@
     except TypeError:
         alternative_name = None
 
-    not_object_file = args.not_object_file
-    # the default is "yes, this is an object file" if not-object-file was
-    # unsupplied
-
     try:
         update_db.upload_program(connection,
                                  getpass.getuser(),
                                  args.input_file,
                                  alternative_name,
-                                 not_object_file)
+                                 args.not_object_file,
+                                 args.add_file_paths)
     except requests.exceptions.ConnectionError as e:
         msg = 'Connection error to server {}: {}'
         logging.error(msg.format(_displayable_url(args), e))
@@ -221,6 +218,12 @@
                                 help='default False, if the input file is an '
                                      'object to be disassembled',
                                 action='store_true')
+    parsers['add'].add_argument('--add-file-paths',
+                                help='default False, set to True to make objdump '
+                                     'extract the file paths for each function. '
+                                     'WARNING: this is SLOW for large object files, '
+                                     '~15 hours for IOS.',
+                                action='store_true')
 
     parsers['delete'] = subparsers.add_parser('delete-program',
                                               help="delete a program from the database")
=== modified file 'src/sextant/db_api.py'
--- src/sextant/db_api.py 2014-11-21 12:34:24 +0000
+++ src/sextant/db_api.py 2014-11-21 12:34:25 +0000
@@ -159,7 +159,7 @@
159 tmp_path = os.path.join(self._tmp_dir, '{}_{{}}'.format(program_name))159 tmp_path = os.path.join(self._tmp_dir, '{}_{{}}'.format(program_name))
160160
161 self.func_writer = CSVWriter(tmp_path.format('funcs'), 161 self.func_writer = CSVWriter(tmp_path.format('funcs'),
162 headers=['name', 'type'], 162 headers=['name', 'type', 'file'],
163 max_rows=5000)163 max_rows=5000)
164 self.call_writer = CSVWriter(tmp_path.format('calls'), 164 self.call_writer = CSVWriter(tmp_path.format('calls'),
165 headers=['caller', 'callee'], 165 headers=['caller', 'callee'],
@@ -171,7 +171,7 @@
171 ' WITH line, toInt(line.id) as lineid'171 ' WITH line, toInt(line.id) as lineid'
172 ' MATCH (n:program {{name: "{}"}})'172 ' MATCH (n:program {{name: "{}"}})'
173 ' CREATE (n)-[:subject]->(m:func {{name: line.name,'173 ' CREATE (n)-[:subject]->(m:func {{name: line.name,'
174 ' id: lineid, type: line.type}})')174 ' id: lineid, type: line.type, file: line.file}})')
175 175
176 self.add_call_query = (' USING PERIODIC COMMIT 250'176 self.add_call_query = (' USING PERIODIC COMMIT 250'
177 ' LOAD CSV WITH HEADERS FROM "file:{}" AS line'177 ' LOAD CSV WITH HEADERS FROM "file:{}" AS line'
@@ -203,7 +203,7 @@
203 # Propagate the error if there is one.203 # Propagate the error if there is one.
204 return False if etype is not None else True204 return False if etype is not None else True
205205
206 def add_function(self, name, typ='normal'):206 def add_function(self, name, typ='normal', source='unknown'):
207 """207 """
208 Add a function.208 Add a function.
209209
@@ -219,7 +219,7 @@
219 pointer: we know only that the function exists, not its219 pointer: we know only that the function exists, not its
220 name or details.220 name or details.
221 """221 """
222 self.func_writer.write(name, typ)222 self.func_writer.write(name, typ, source)
223223
224 def add_call(self, caller, callee):224 def add_call(self, caller, callee):
225 """225 """
@@ -257,6 +257,19 @@
257 remote_paths:257 remote_paths:
258 A list of the paths of the remote fils.258 A list of the paths of the remote fils.
259 """259 """
260
261 def try_rmdir(path):
262 # Helper function to try and remove a directory, silently
263 # fail if it contains files, otherwise raise the exception.
264 try:
265 os.rmdir(path)
266 except OSError as e:
267 if e.errno in (os.errno.ENOTEMPTY, os.errno.ENOENT):
268 # Files in directory or directory doesn't exist.
269 pass
270 else:
271 raise e
272
260 print('Cleaning temporary files...', end='')273 print('Cleaning temporary files...', end='')
261 file_paths = list(itertools.chain(self.func_writer.file_iter(),274 file_paths = list(itertools.chain(self.func_writer.file_iter(),
262 self.call_writer.file_iter()))275 self.call_writer.file_iter()))
@@ -264,16 +277,9 @@
264 for path in file_paths: 277 for path in file_paths:
265 os.remove(path)278 os.remove(path)
266279
267 os.rmdir(self._tmp_dir)280 try_rmdir(self._tmp_dir)
268281 try_rmdir(TMP_DIR)
269 try:282
270 # If the parent sextant temp folder is empty, remove it.
271 os.rmdir(TMP_DIR)
272 except:
273 # There is other stuff in TMP_DIR (i.e. from other users), so
274 # leave it.
275 pass
276
277 self._ssh.remove_from_tmp_dir(remote_paths)283 self._ssh.remove_from_tmp_dir(remote_paths)
278284
279 print('done.')285 print('done.')
@@ -290,6 +296,7 @@
290296
291 tx.append('CREATE CONSTRAINT ON (p:program) ASSERT p.name IS UNIQUE')297 tx.append('CREATE CONSTRAINT ON (p:program) ASSERT p.name IS UNIQUE')
292 tx.append('CREATE INDEX ON :func(name)')298 tx.append('CREATE INDEX ON :func(name)')
299 tx.append('CREATE INDEX ON: func(file)')
293300
294 # Apply the transaction.301 # Apply the transaction.
295 tx.commit()302 tx.commit()
@@ -832,7 +839,7 @@
832 result = self._db.query(q, returns=neo4jrestclient.Node)839 result = self._db.query(q, returns=neo4jrestclient.Node)
833 return bool(result)840 return bool(result)
834841
835 def get_function_names(self, program_name, search, max_funcs):842 def get_function_names(self, program_name, search=None, max_funcs=None):
836 """843 """
837 Execute query to retrieve a list of all functions in the program.844 Execute query to retrieve a list of all functions in the program.
838 Any of the output names can be used verbatim in any SextantConnection845 Any of the output names can be used verbatim in any SextantConnection
@@ -845,15 +852,82 @@
845 if not validate_query(program_name):852 if not validate_query(program_name):
846 return set()853 return set()
847854
855 limit = "LIMIT {}".format(max_funcs) if max_funcs else ""
856
848 if not search:857 if not search:
849 q = (' MATCH (:program {{name: "{}"}})-[:subject]->(f:func)'858 q = (' MATCH (:program {{name: "{}"}})-[:subject]->(f:func)'
850 ' RETURN f.name LIMIT {}').format(program_name, max_funcs)859 ' RETURN f.name {}').format(program_name, limit)
851 else:860 else:
852 q = (' MATCH (:program {{name: "{}"}})-[:subject]->(f:func)'861 q = (' MATCH (:program {{name: "{}"}})-[:subject]->(f:func)'
853 ' WHERE f.name =~ ".*{}.*" RETURN f.name LIMIT {}'862 ' WHERE f.name =~ ".*{}.*" RETURN f.name {}'
854 .format(program_name, search, max_funcs))863 .format(program_name, search, limit))
855 return {func[0] for func in self._db.query(q)}864 return {func[0] for func in self._db.query(q)}
856865
866 @staticmethod
867 def get_query(identifier, search):
868 """
869 Builds a filter query from a search pattern which may contain commas
870 and/or wildcards.
871
872 Return:
873 string: part of a valid cypher query.
874 Arguments:
875 identifier:
876 The identifier of the node whose properties to filter on,
877 e.g. 'f' after a 'MATCH (f:func) ...'
878 search:
879 The pattern to build the search from, of form:
880 '<name patterns>:<path patterns>'
881 where patterns are possibly empty, possibly comma separated
882 lists of strings, which will be compared to the 'name' and
883 'file' (path) attributes of 'identifier'.
884
885 These strings may contain wildcards: e.g:
886 .*substring.*
887 sub.*string
888 etc.
889
890 """
891 if ':' in search:
892 func_subs, file_subs = search.split(':')
893 else:
894 func_subs, file_subs = search, ''
895
896 # Remove empty strings.
897 func_subs = [sub for sub in func_subs.split(',') if sub]
898 file_subs = [sub for sub in file_subs.split(',') if sub]
899
900 # Cases for search:
901 # <specific name>:<redundant stuff>
902 # <wildcard name>:<specific filepath>
903 # <wildcard name>:<wildcard filepath>
904
905 query_str = ""
906
907 def get_list(subs):
908 return '[{}]'.format(','.join("'{}'".format(s) for s in subs))
909
910
911 if func_subs and not any('*' in sub for sub in func_subs):
912 # List of specific functions. Don't care about anything after ':'
913 query_str += ('USING INDEX {0}:func(name) WHERE {0}.name IN {1} '
914 .format(identifier, get_list(func_subs)))
915 else:
916 if file_subs and not any('*' in sub for sub in file_subs):
917 # Specific file to look in.
918 query_str = ('USING INDEX {0}.func(file) WHERE {0}.file IN {1} '
919 .format(identifier, get_list(file_subs)))
920 elif file_subs:
921 query_str = ('WHERE ANY (s_file IN {} WHERE {}.file =~ s_file) '
922 .format(get_list(file_subs), identifier))
923
924 if func_subs:
925 query_str += 'AND ' if file_subs else 'WHERE '
926 query_str += ('ANY (s_name IN {} WHERE {}.name =~ s_name) '
927 .format(get_list(func_subs), identifier))
928
929 return query_str
930
857 def get_all_functions_called(self, program_name, function_calling):931 def get_all_functions_called(self, program_name, function_calling):
858 """932 """
859 Execute query to find all functions called by a function (indirectly).933 Execute query to find all functions called by a function (indirectly).
@@ -863,14 +937,9 @@
863 :param function_calling: string name of a function whose children to find937 :param function_calling: string name of a function whose children to find
864 :return: FunctionQueryResult, maximal subgraph rooted at function_calling938 :return: FunctionQueryResult, maximal subgraph rooted at function_calling
865 """939 """
866940 q = (' MATCH (p:program {{name: "{}"}})-[:subject]->(f:func) {}'
867 if not self.check_function_exists(program_name, function_calling):941 ' MATCH (f)-[:calls]->(g:func) RETURN distinct f, g'
868 return None942 .format(program_name, SextantConnection.get_query('f', function_calling)))
869
870 q = (' MATCH (p:program {{name: "{}"}})-[:subject]->(f:func {{name: "{}"}})'
871 ' USING INDEX f:func(name)'
872 ' MATCH (f)-[:calls*]->(g) RETURN distinct f, g'
873 .format(program_name, function_calling))
874943
875 return self._execute_query(program_name, q)944 return self._execute_query(program_name, q)
876945
@@ -884,14 +953,10 @@
884 :return: FunctionQueryResult, maximal connected subgraph with leaf function_called953 :return: FunctionQueryResult, maximal connected subgraph with leaf function_called
885 """954 """
886955
887 if not self.check_function_exists(program_name, function_called):956 q = (' MATCH (p:program {{name: "{}"}})-[:subject]->(g:func) {}'
888 return None957 ' MATCH (f)-[:calls]->(g)'
889958 ' RETURN distinct f, g')
890 q = (' MATCH (p:program {{name: "{}"}})-[:subject]->(g:func {{name: "{}"}})'959 q = q.format(program_name, SextantConnection.get_query('g', function_called), program_name)
891 ' USING INDEX g:func(name)'
892 ' MATCH (f)-[:calls*]->(g) WHERE f.name <> "{}"'
893 ' RETURN distinct f , g')
894 q = q.format(program_name, function_called, program_name)
895960
896 return self._execute_query(program_name, q)961 return self._execute_query(program_name, q)
897962
@@ -910,22 +975,17 @@
910 if not self.check_program_exists(program_name):975 if not self.check_program_exists(program_name):
911 return None976 return None
912977
913 if not self.check_function_exists(program_name, function_called):978 start_q = SextantConnection.get_query('start', function_calling)
914 return None979 end_q = SextantConnection.get_query('end', function_called)
915980
916 if not self.check_function_exists(program_name, function_calling):981 q = (' MATCH (p:program {{name: "{}"}})'
917 return None982 ' MATCH (p)-[:subject]->(start:func) {} WITH start, p'
918983 ' MATCH (p)-[:subject]->(end:func) {} WITH start, end'
919 q = (' MATCH (p:program {{name: "{}"}})-[:subject]->(start:func {{name: "{}"}})'
920 ' USING INDEX start:func(name)'
921 ' MATCH (p)-[:subject]->(end:func {{name: "{}"}})'
922 ' USING INDEX end:func(name)'
923 ' MATCH path=(start)-[:calls*]->(end)'984 ' MATCH path=(start)-[:calls*]->(end)'
924 ' WITH DISTINCT nodes(path) AS result'985 ' WITH DISTINCT nodes(path) AS result'
925 ' UNWIND result AS answer'986 ' UNWIND result AS answer'
926 ' RETURN answer')987 ' RETURN answer')
927 q = q.format(program_name, function_calling, function_called)988 q = q.format(program_name, start_q, end_q)
928
929 return self._execute_query(program_name, q)989 return self._execute_query(program_name, q)
930990
931 def get_whole_program(self, program_name):991 def get_whole_program(self, program_name):
@@ -942,7 +1002,7 @@
942 ' RETURN (f)'.format(program_name))1002 ' RETURN (f)'.format(program_name))
943 return self._execute_query(program_name, q)1003 return self._execute_query(program_name, q)
9441004
945 def get_shortest_path_between_functions(self, program_name, func1, func2):1005 def get_shortest_path_between_functions(self, program_name, function_calling, function_called):
946 """1006 """
947 Execute query to get a single, shortest, path between two functions.1007 Execute query to get a single, shortest, path between two functions.
948 :param program_name: string name of the program we wish to search under1008 :param program_name: string name of the program we wish to search under
@@ -953,17 +1013,16 @@
953 if not self.check_program_exists(program_name):1013 if not self.check_program_exists(program_name):
954 return None1014 return None
9551015
956 if not self.check_function_exists(program_name, func1):1016 start_q = SextantConnection.get_query('start', function_calling)
957 return None1017 end_q = SextantConnection.get_query('end', function_called)
9581018
959 if not self.check_function_exists(program_name, func2):1019 q = (' MATCH (p:program {{name: "{}"}})'
960 return None1020 ' MATCH (p)-[:subject]->(start:func) {} WITH start, p'
9611021 ' MATCH (p)-[:subject]->(end:func) {} WITH start, end'
962 q = (' MATCH (p:program {{name: "{}"}})-[:subject]->(f:func {{name: "{}"}})'1022 ' MATCH path=shortestPath((start)-[:calls*]->(end))'
963 ' USING INDEX f:func(name)'1023 ' UNWIND nodes(path) AS answer'
964 ' MATCH (p)-[:subject]->(g:func {{name: "{}"}})'1024 ' RETURN answer')
965 ' MATCH path=shortestPath((f)-[:calls*]->(g))'1025 q = q.format(program_name, start_q, end_q)
966 ' UNWIND nodes(path) AS ans'
967 ' RETURN ans'.format(program_name, func1, func2))
9681026
969 return self._execute_query(program_name, q)1027 return self._execute_query(program_name, q)
1028
9701029
=== modified file 'src/sextant/export.py'
--- src/sextant/export.py 2014-10-13 14:58:12 +0000
+++ src/sextant/export.py 2014-11-21 12:34:25 +0000
@@ -48,7 +48,7 @@
     for func in program.get_functions():
         if func.type == "stub":
             output_str += ' "{}" [fillcolor=pink, style=filled]\n'.format(func.name)
-        elif func.type == "function_pointer":
+        elif func.type == "pointer":
             output_str += ' "{}" [fillcolor=yellow, style=filled]\n'.format(func.name)
 
         # in all cases, even if we've specified that we want a filled-in
=== modified file 'src/sextant/objdump_parser.py' (properties changed: +x to -x)
--- src/sextant/objdump_parser.py 2014-10-23 11:15:48 +0000
+++ src/sextant/objdump_parser.py 2014-11-21 12:34:25 +0000
@@ -42,9 +42,12 @@
42 The number of function calls that have been parsed.42 The number of function calls that have been parsed.
43 function_ptr_count:43 function_ptr_count:
44 The number of function pointers that have been detected.44 The number of function pointers that have been detected.
45 _known_stubs:45 _known_functions:
46 A set of the names of functions with type 'stub' that have been46 A set of the names of functions that have been
47 parsed - used to avoid registering a stub multiple times.47 parsed - used to avoid registering a function multiple times.
48 _partial_functions:
49 A set of functions whose names we have seen but whose source
50 files we don't yet know.
4851
49 """52 """
50 def __init__(self, file_path, file_object=None, 53 def __init__(self, file_path, file_object=None,
@@ -102,13 +105,14 @@
102 self.call_count = 0105 self.call_count = 0
103 self.function_ptr_count = 0106 self.function_ptr_count = 0
104 107
105 # Avoid adding duplicate function stubs (as these are detected from108 # Avoid adding duplicate functions.
106 # function calls so may be repeated).109 self._known_functions = set()
107 self._known_stubs = set()110 # Set of partially-parsed functions.
111 self._partial_functions = set()
108112
109 # By default print information to stdout.113 # By default print information to stdout.
110 def print_func(name, typ):114 def print_func(name, typ, source='unknown'):
111 print('func {:25}{}'.format(name, typ))115 print('func {:25}{:15}{}'.format(name, typ, source))
112116
113 def print_call(caller, callee):117 def print_call(caller, callee):
114 print('call {:25}{:25}'.format(caller, callee))118 print('call {:25}{:25}'.format(caller, callee))
@@ -116,7 +120,6 @@
116 def print_started(parser):120 def print_started(parser):
117 print('parse started: {}[{}]'.format(self.path, ', '.join(self.sections)))121 print('parse started: {}[{}]'.format(self.path, ', '.join(self.sections)))
118122
119
120 def print_finished(parser):123 def print_finished(parser):
121 print('parsed {} functions and {} calls'.format(self.function_count, self.call_count))124 print('parsed {} functions and {} calls'.format(self.function_count, self.call_count))
122125
@@ -134,12 +137,32 @@
134 self.function_ptr_count += 1137 self.function_ptr_count += 1
135 return name138 return name
136139
137 def _add_function_normal(self, name):140 def _add_function(self, name, source=None):
138 """141 """
139 Add a function which we have full assembly code for.142 Add a partially known or fully known function.
140 """143 """
141 self.add_function(name, 'normal')144 if source is None:
142 self.function_count += 1145 # Partial definition - if do not already have a full definition
146 # for this name then add it to the partials set.
147 if not name in self._known_functions:
148 self._partial_functions.add(name)
149 elif source == 'unknown':
150 # Manually adding a stub function.
151 self.add_function(name, 'stub', source)
152 self.function_count += 1
153 elif name not in self._known_functions:
154 # A full definition - either upgrade from partial function
155 # to known function, or add directly to known functions
156 # (otherwise we have already seen it)
157
158 try:
159 self._partial_functions.remove(name)
160 except KeyError:
161 pass
162
163 self._known_functions.add(name)
164 self.add_function(name, 'normal', source)
165 self.function_count += 1
143166
144 def _add_function_ptr(self, name):167 def _add_function_ptr(self, name):
145 """168 """
@@ -148,15 +171,6 @@
148 self.add_function(name, 'pointer')171 self.add_function(name, 'pointer')
149 self.function_count += 1172 self.function_count += 1
150173
151 def _add_function_stub(self, name):
152 """
153 Add a function stub - we have its name but none of its internals.
154 """
155 if not name in self._known_stubs:
156 self._known_stubs.add(name)
157 self.add_function(name, 'stub')
158 self.function_count += 1
159
160 def _add_call(self, caller, callee):174 def _add_call(self, caller, callee):
161 """175 """
162 Add a function call from caller to callee.176 Add a function call from caller to callee.
@@ -171,10 +185,20 @@
171 self.started()185 self.started()
172186
173 if self._file is not None:187 if self._file is not None:
174 in_section = False # if we are in one of self.sections188 in_section = False # If we are in one of self.sections.
175 current_function = None # track the caller for function calls189 current_function = None # Track the caller for function calls.
190 to_add = False
176191
177 for line in self._file:192 for line in self._file:
193 if to_add:
194 file_line = line.startswith('/')
195 source = line.split(':')[0] if file_line else None
196 self._add_function(current_function, source)
197 to_add = False
198
199 if file_line:
200 continue
201
178 if line.startswith('Disassembly'):202 if line.startswith('Disassembly'):
179 # 'Disassembly of section <name>:\n'203 # 'Disassembly of section <name>:\n'
180 section = line.split(' ')[-1].rstrip(':\n')204 section = line.split(' ')[-1].rstrip(':\n')
@@ -189,12 +213,19 @@
189 # <function_name>[@plt]213 # <function_name>[@plt]
190 function_identifier = line.split('<')[-1].split('>')[0]214 function_identifier = line.split('<')[-1].split('>')[0]
191215
216 # IOS builds add a __be_ (big endian) prefix to all functions,
217 # get rid of it if it is there,
218 if function_identifier.startswith('__be_'):
219 function_identifier = function_identifier.lstrip('__be_')
220
192 if '@' in function_identifier:221 if '@' in function_identifier:
222 # Of form <function name>@<other stuff>.
193 current_function = function_identifier.split('@')[0]223 current_function = function_identifier.split('@')[0]
194 self._add_function_stub(current_function)224 self._add_function(current_function)
195 else:225 else:
196 current_function = function_identifier226 current_function = function_identifier
197 self._add_function_normal(current_function)227 # Flag function - we look for source on the next line.
228 to_add = True
198229
199 elif 'call ' in line or 'callq ' in line:230 elif 'call ' in line or 'callq ' in line:
200 # WHITESPACE to prevent picking up function names 231 # WHITESPACE to prevent picking up function names
@@ -213,9 +244,12 @@
213 # from which we extract name244 # from which we extract name
214 callee_is_ptr = False245 callee_is_ptr = False
215 function_identifier = callee_info.lstrip('<').rstrip('>\n')246 function_identifier = callee_info.lstrip('<').rstrip('>\n')
247 if function_identifier.startswith('__be_'):
248 function_identifier = function_identifier.lstrip('__be_')
249
216 if '@' in function_identifier:250 if '@' in function_identifier:
217 callee = function_identifier.split('@')[0]251 callee = function_identifier.split('@')[0]
218 self._add_function_stub(callee)252 self._add_function(callee)
219 else:253 else:
220 callee = function_identifier.split('-')[-1].split('+')[0]254 callee = function_identifier.split('-')[-1].split('+')[0]
221 # Do not add this fn now - it is a normal func255 # Do not add this fn now - it is a normal func
@@ -231,6 +265,10 @@
231 # Add the call.265 # Add the call.
232 if not (self.ignore_ptrs and callee_is_ptr):266 if not (self.ignore_ptrs and callee_is_ptr):
233 self._add_call(current_function, callee)267 self._add_call(current_function, callee)
268
269 for name in self._partial_functions:
270 self._add_function(name, 'unknown')
271
234 272
235 self.finished()273 self.finished()
236274
@@ -261,7 +299,7 @@
261 return result299 return result
262300
263301
264def run_objdump(input_file):302def run_objdump(input_file, add_file_paths=False):
265 """303 """
266 Run the objdump command on the file with the given path.304 Run the objdump command on the file with the given path.
267305
@@ -271,13 +309,24 @@
271 Arguments:309 Arguments:
272 input_file:310 input_file:
273 The path of the file to run objdump on.311 The path of the file to run objdump on.
312 add_file_paths:
313 Whether to call with -l option to extract line numbers and source
314 files from the binary. VERY SLOW on large binaries (~15 hours for ios).
274 315
275 """316 """
317 print('input file: {}'.format(input_file))
276 # A single section can be specified for parsing with the -j flag,318 # A single section can be specified for parsing with the -j flag,
277 # but it is not obviously possible to parse multiple sections like this.319 # but it is not obviously possible to parse multiple sections like this.
278 p = subprocess.Popen(['objdump', '-d', input_file, '--no-show-raw-insn'], 320 args = ['objdump', '-d', input_file, '--no-show-raw-insn']
279 stdout=subprocess.PIPE)321 if add_file_paths:
280 g = subprocess.Popen(['egrep', 'Disassembly|call(q)? |>:$'], stdin=p.stdout, stdout=subprocess.PIPE)322 args += ['--line-numbers']
323
324 p = subprocess.Popen(args, stdout=subprocess.PIPE)
325 # Egrep filters out the section headers (Disassembly of section...),
326 # the call lines (... [l]call[q] ...), the function declarations
327 # (... <function>:$) and the file paths (^/file_path).
328 g = subprocess.Popen(['egrep', 'Disassembly|call(q)? |>:$|^/'],
329 stdin=p.stdout, stdout=subprocess.PIPE)
281 return input_file, g.stdout330 return input_file, g.stdout
282331
283332
284333
=== modified file 'src/sextant/test_parser.py'
--- src/sextant/test_parser.py 2014-10-23 11:15:48 +0000
+++ src/sextant/test_parser.py 2014-11-21 12:34:25 +0000
@@ -23,7 +23,7 @@
         calls = defaultdict(list)
 
         # set the Parser to put output in local dictionaries
-        add_function = lambda n, t: self.add_function(functions, n, t)
+        add_function = lambda n, t, s='unknown': self.add_function(functions, n, t)
         add_call = lambda a, b: self.add_call(calls, a, b)
 
         p = parser.Parser(path, sections=sections, ignore_ptrs=ignore_ptrs,
=== modified file 'src/sextant/test_resources/parser_test'
Binary files src/sextant/test_resources/parser_test 2014-10-13 14:10:01 +0000 and src/sextant/test_resources/parser_test 2014-11-21 12:34:25 +0000 differ
=== modified file 'src/sextant/update_db.py'
--- src/sextant/update_db.py 2014-10-17 14:20:06 +0000
+++ src/sextant/update_db.py 2014-11-21 12:34:25 +0000
@@ -20,7 +20,7 @@
 import logging
 
 def upload_program(connection, user_name, file_path, program_name=None,
-                   not_object_file=False):
+                   not_object_file=False, add_file_paths=False):
     """
     Upload a program's functions and call graph to the database.
 
@@ -38,6 +38,9 @@
         not_object_file:
             Flag controlling whether file_path is pointing to a dump file or
             a binary file.
+        add_file_paths:
+            Flag controlling whether to call objdump with the -l option to
+            extract line numbers and source files. VERY SLOW on large binaries.
     """
     if not connection._ssh:
         raise SSHConnectionError('An SSH connection is required for '
42 if not connection._ssh:45 if not connection._ssh:
43 raise SSHConnectionError('An SSH connection is required for '46 raise SSHConnectionError('An SSH connection is required for '
@@ -59,9 +62,9 @@
     start = time()
 
     if not not_object_file:
-        print('Generating dump file...', end='')
+        print('Generating dump file with{} file paths...'.format(('out', '')[add_file_paths]), end='')
         sys.stdout.flush()
-        file_path, file_object = run_objdump(file_path)
+        file_path, file_object = run_objdump(file_path, add_file_paths)
         print('done.')
     else:
         file_object = None
@@ -82,15 +85,19 @@
         print('done: {} functions and {} calls.'
               .format(parser.function_count, parser.call_count))

-    parser = Parser(file_path = file_path, file_object = file_object,
+    parser = Parser(file_path=file_path, file_object = file_object,
                     sections=[],
-                    add_function = program.add_function,
-                    add_call = program.add_call,
+                    add_function=program.add_function,
+                    add_call=program.add_call,
                     started=lambda parser: start_parser(program),
                     finished=lambda parser: finish_parser(parser, program))
+
     parser.parse()

-    program.commit()
+    if parser.function_count == 0:
+        print('Nothing to upload. Did you mean to add the --not-object-file flag?')
+    else:
+        program.commit()

     end = time()
     print('Finished in {:.2f}s.'.format(end-start))

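A minimal sketch of the two behaviours this file's change describes, under stated assumptions. `dump_message` rebuilds the progress message with a conditional expression instead of the tuple-indexing form used in the diff; both give the same text. `build_objdump_command` is a hypothetical helper, not the project's `run_objdump` (which is not shown here); it only illustrates how an `add_file_paths` flag could append objdump's `-l` option. The `-d` option and the helper's name and signature are assumptions.

```python
def dump_message(add_file_paths):
    # Equivalent to ('out', '')[add_file_paths] in the diff:
    # False selects 'out' ("without"), True selects '' ("with").
    suffix = '' if add_file_paths else 'out'
    return 'Generating dump file with{} file paths...'.format(suffix)


def build_objdump_command(binary_path, add_file_paths=False):
    # Hypothetical helper; the real run_objdump may build its command differently.
    command = ['objdump', '-d']  # '-d' (disassemble) is an assumption
    if add_file_paths:
        # '-l' asks objdump for source file names and line numbers.
        command.append('-l')
    command.append(binary_path)
    return command
```

For example, `build_objdump_command('a.out', add_file_paths=True)` yields a command list that includes `-l`, while the default leaves it out.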
=== modified file 'src/sextant/web/server.py'
--- src/sextant/web/server.py 2014-11-21 12:34:24 +0000
+++ src/sextant/web/server.py 2014-11-21 12:34:25 +0000
@@ -13,6 +13,8 @@
 from twisted.internet.threads import deferToThread
 from twisted.internet import defer

+from neo4jrestclient.exceptions import TransactionException
+
 import logging
 import os
 import json
@@ -24,6 +26,8 @@
 import tempfile
 import subprocess

+from datetime import datetime
+
 from cgi import escape # deprecated in Python 3 in favour of html.escape, but we're stuck on Python 2

 # global SextantConnection object which deals with the port forwarding
@@ -174,13 +178,15 @@
         # if we are okay here we have a valid query with all required arguments
         if res_code is RESPONSE_CODE_OK:
             try:
+                print('running query {}'.format(datetime.now()))
                 program = yield defer_to_thread_with_timeout(render_timeout, fn,
                                                              name, *req_args)
-            except defer.CancelledError:
+                print('\tdone {}'.format(datetime.now()))
+            except Exception as e:
                 # the timeout has fired and cancelled the request
                 res_code = RESPONSE_CODE_BAD_REQUEST
-                res_fmt = "The request timed out after {} seconds."
-                res_msg = res_fmt.format(render_timeout)
+                res_msg = "{}".format(e)
+                print('\tfailed {}'.format(datetime.now()))

         if res_code is RESPONSE_CODE_OK:
             # we have received a response to our request
@@ -201,10 +207,12 @@
             suppress_common = suppress_common_arg in ('null', 'true')

             # we have a non-empty return - render it
+            print('getting plot {}'.format(datetime.now()))
             res_msg = yield deferToThread(self.get_plot, program,
                                           suppress_common,
                                           remove_self_calls=False)
             request.setHeader('content-type', 'image/svg+xml')
+            print('\tdone {}'.format(datetime.now()))

         request.setResponseCode(res_code)
         request.write(res_msg)
@@ -229,6 +237,7 @@
         max_funcs = AUTOCOMPLETE_NAMES_LIMIT + 1
         programs = CONNECTION.programs_with_metadata()
         result = CONNECTION.get_function_names(program_name, search, max_funcs)
+        print(search, len(result))
         return result if len(result) < max_funcs else set()


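The server.py change widens `except defer.CancelledError` to `except Exception as e` and reports `"{}".format(e)`, so the timeout case no longer gets its own message even though the comment still describes a timeout. The sketch below is one possible way to keep both messages; it is an illustration, not the code in this branch. `CancelledError` here is a local stand-in for `twisted.internet.defer.CancelledError`, and `failure_message` and `run_query` are hypothetical names.

```python
class CancelledError(Exception):
    """Local stand-in for twisted.internet.defer.CancelledError (assumption)."""


def failure_message(error, render_timeout):
    # Keep the original timeout wording when the request was cancelled.
    if isinstance(error, CancelledError):
        return 'The request timed out after {} seconds.'.format(render_timeout)
    # Any other failure is reported the way the new code does.
    return '{}'.format(error)


def run_query(fn, render_timeout):
    # Simplified, synchronous version of the try/except in the diff.
    try:
        return 'ok', fn()
    except Exception as e:
        return 'bad request', failure_message(e, render_timeout)
```

With this split, a cancelled request still produces the "timed out" text, while other errors, such as a database exception, surface their own message.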
